HyperionGaming.org
Team Fortress 2 => Payload Server => Topic started by: n. on December 10, 2013, 11:51:16 AM
-
So I've noticed a lot of lag lately, and I'm pretty sure it's the server. The reason I think it's server side is that (a) I'm not seeing it on other servers with similar latency and (b) the net graph seems to indicate it, if I'm reading it correctly. See attached screenshot. It's my understanding that the red ticks on the lower green line indicate that the server is falling below 10fps? It was pretty bad last night.
edit: so I don't think I was reading it correctly (just re-read the wiki page on the net graph) I think it's mostly packet loss I'm seeing.
-
I don't fully understand this stuff myself but yeah, "loss" and "choke" are the 2 important parts as I understand it.
I haven't been monitoring netgraph lately but I have noticed lag myself. The game literally stops for a split second then carries on, though everyone is now in a new place from where they were beforehand.
I'm assuming it is also lag (compensation) that makes me miss when they are RIGHT in front of me and the reticle is centre of body mass. No ding, no numbers flying up, rather frustrating.
-
All I know is this isn't a normal looking graph. If I hop on another server with a ~70-80ish ping, the graph is continuous without all these gaps, and the gameplay is much smoother.
-
Well today it's much better. Not sure what changed.
edit: spoke too soon, still really bad.
-
Nothing was changed on the Payload server, I'll look at your graphs and run my own tests when I come back home later tonight. I'll update my post.
Update:
I'm not getting constant loss/choke like in your screenshot, but there is occasional stuttering. Loss and choke can often be fixed by modifying client rates, but I'll take a look at our server rates. With SteamPipe, values have changed a bit.
The occasional stuttering does occur more on Payload #1 than on Payload #2, which makes sense since Payload #1 is pretty much full 24/7. I will try restarting the machine when possible (when servers are empty), it should help a bit.
Finally, I'll try to see if I can move the servers to slightly better hardware. This requires more time, so I'll let you guys know.
--
To check server tickrate, use net_graph 4. It's the bottom "sv" value, it very rarely drops below 66.7.
If you guys get a lot of stuttering and/or heavy choke/loss, please take a few screenshots (or a video) and post them here along with time of day and which server you were playing on. That would really help me.
-
I have a question coming from a discussion on my ISP's forums.
Someone was encouraged to go from a 20Mbps line to a 50Mbps line to improve their gaming experience. (Lag.) I suggested it was bad advice, as games use very little in the way of data transfer. A (different) rep. suggested this bump would help, and I said it won't. He then agreed, and suggested it was the up speed that may be an issue.
It got me thinking, I have 0.5Mbps up speed with my 20Mbps line. He suggested that may be an issue.
So here I am, asking you folks, do you think 0.5Mbps up could be an issue? Would 2.5Mbps up be better, or make any difference?
-
Great, thanks for looking into it Plasma. It's definitely a sporadic thing, sometimes it's worse than others. I'll try playing around with my rates. Sometimes latency is affected as well - usually I'm in the 80s, but sometimes it jumps up to 110/120 for hours at a time and then goes back down. I would think it's my connection or ISP, but it doesn't seem to affect other servers when it's happening. I'll get some more screenshots when it's bad again.
Wolfpup: 500k up is plenty for games, as long as you've got nothing - and I mean nothing - else going on at the same time.
In general, though, it's more about latency and loss for gaming, not total bw.
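To put rough numbers on the "500k up is plenty" point, here is a back-of-the-envelope sketch. It assumes the client sends 66 command packets per second and that each packet is about 100 bytes; both figures are illustrative guesses rather than measurements, and `upstream_usage_kbps` is a hypothetical helper name.

```python
def upstream_usage_kbps(cmdrate: int, packet_bytes: int) -> float:
    """Approximate client upload in kilobits per second."""
    # packets/second * bytes/packet * 8 bits/byte, converted to kilobits
    return cmdrate * packet_bytes * 8 / 1000


uplink_kbps = 500  # the 0.5Mbps upload mentioned in the question

# ASSUMPTION: 66 packets/s and ~100 bytes per packet are guesses for illustration
usage = upstream_usage_kbps(cmdrate=66, packet_bytes=100)

print(f"game upload ~{usage:.1f} kbps of {uplink_kbps} kbps "
      f"({usage / uplink_kbps:.0%} of the line)")
```

Under those assumptions the game uses roughly a tenth of a 0.5Mbps uplink, which fits the advice above: it only becomes a problem if something else is saturating the upload at the same time.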
-
That's what I thought too. Thanks.
Now I'm playing with cl_"whatever" commands in my autoexec to try and sort out choke. (Not terrible, but there from time to time.)
-
UPDATE: The past couple of nights the server has been much better, 0 loss pretty much all night while the server was full. This morning, the high packet loss is back. It's 12:09pm central (10am there?) and the attached screenshot shows what I'm seeing. I did set all my config parameters (update/cmd rate etc) back to their defaults, which seems to have fixed the accompanying choke I was getting, but the loss remains. Right after this I jumped onto another full 32-player server just to make sure it wasn't me, and got no loss :(
Edit: And now it's fixed again? 12:19pm central.
-
OK this is useful because I'm not experiencing much loss at all when I'm playing.
I'll keep checking.
-
I've been monitoring net_graph myself more lately and I too have been seeing 'loss' up to 10, and 'choke' hits 9, but mostly just in spawn. Once I leave spawn, 'choke' settles down. That's when 'loss' takes over. Though not constant, it fluctuates from 0 up to the aforementioned high of 10. With, like n.'s screen cap, 7 being the "magic" number I see the most.
I've noticed my ping is also 18-20 higher than it has been, but that may be my settings. (cl_interp/etc.) I've been playing with different numbers and managed to get 'lerp' from 15 and orange to 30 and white. (White being preferable even at a higher number as I understand it.)
I'll continue to play around, including removing all my autoexec settings regarding this. (Pls note I get the gist of this stuff but I'm no pro that's for sure.)
-
So n., do you use an autoexec.cfg ?
Been playing with the numbers? (See below.) Or have you left it default?
I tried reverting back to default, but found it to be kinda bad. I then tried the "standard" internet recommendation using these settings:
cl_cmdrate 66
cl_updaterate 66
cl_interp 0.0152
cl_interp_ratio 1
...and it seems to be not too bad. I still get 'loss' and 'choke' but it's less prevalent and the numbers do not go as high. 'Loss' seems to hit 5 but mostly less. 'Choke' went to 4 a few times outside spawn but the game seems to work better. It feels like my shots hit more often rather than go "right through" who I am shooting. (Because on their end of the internet, they are "not really there" any more due to lag compensation.)
I still want to monitor this more but so far, those numbers seem to be the best. Oh yeah, and lerp is still white @ 30.
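For anyone wondering where the 'lerp' number comes from, here is a small sketch. It assumes the commonly documented Source engine relation, lerp = max(cl_interp, cl_interp_ratio / cl_updaterate); `lerp_ms` is a hypothetical helper name, and the second call uses a ratio of 2 purely as a guess at what a server-side minimum might be.

```python
def lerp_ms(cl_interp: float, cl_interp_ratio: float, cl_updaterate: float) -> float:
    """Interpolation delay in milliseconds, assuming
    lerp = max(cl_interp, cl_interp_ratio / cl_updaterate)."""
    return max(cl_interp, cl_interp_ratio / cl_updaterate) * 1000


# The settings quoted above
print(round(lerp_ms(0.0152, 1, 66), 1))  # 15.2

# ASSUMPTION: if the server enforced a minimum interp ratio of 2
print(round(lerp_ms(0.0152, 2, 66), 1))  # 30.3
```

On paper the quoted settings give about 15 ms, so a reading of 30 could mean the ratio actually in effect is 2 rather than 1. That is only one possible explanation and isn't confirmed anywhere in this thread.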
-
In the last few weeks my ping has gone upwards of 500+ at times. I thought it was me so I checked my connection/router but then I noticed that when the lag spikes occurred, other people were experiencing them as well. I live pretty close to San Jose so I usually have 20 or lower ping. Played today and didn't notice any further lag so the issue seems to have been remedied. :)
-
I live pretty close to the servers in San Jose and I notice the server-wide lag spike too. It's rare though.
-
The San Jose location has been hit by big DDOS attacks (not our servers specifically, the location) several times this month. The connection to Internap is supposed to be upgraded to 10Gbps soon. It won't prevent all DDOS, but it should help a bit. There's unfortunately nothing I can do on my end.
These attacks are usually stopped in 10-15min max, but they will cause severe packet loss (and massive lag) when they hit.
-
It's been a while, but I haven't been playing here as much lately because of the lag and wanted to revisit the issue. This same problem still persists - all I need to see it is a window with a constant ping going to the server. It'll be steady around 75ms, then about once per minute will spike up to 140+ and start dropping packets. Then after a few seconds it goes back down. The inconsistency makes it pretty much unplayable whenever this is going on (it's not all the time).
I know it's likely an issue with your host's internet connection (or one of the internet connections) and not the server itself, as it doesn't seem to affect everyone on the server when it happens. And likewise, my connection is fine to other servers while it's happening. Seems like it could still be some DDOS issue like you described (did they never upgrade?), or just some congested/flapping Internet link somewhere in my route to you guys. I'm really not sure what you guys could do about it, just wanted you to know it's still going on. Frustrating not being able to play all the time :/
Here's an example ping. If you want, I could run a traceroute to the server and then run constant pings to each hop to find out exactly which hop is causing the problem. But like I said, not sure you could do anything about it either way.
Reply from 66.151.138.182: bytes=32 time=76ms TTL=113
Reply from 66.151.138.182: bytes=32 time=75ms TTL=114
Reply from 66.151.138.182: bytes=32 time=74ms TTL=114
Reply from 66.151.138.182: bytes=32 time=75ms TTL=114
Reply from 66.151.138.182: bytes=32 time=77ms TTL=114
Reply from 66.151.138.182: bytes=32 time=109ms TTL=114
Reply from 66.151.138.182: bytes=32 time=142ms TTL=114
Reply from 66.151.138.182: bytes=32 time=145ms TTL=114
Reply from 66.151.138.182: bytes=32 time=151ms TTL=114
Reply from 66.151.138.182: bytes=32 time=146ms TTL=114
Request timed out.
Reply from 66.151.138.182: bytes=32 time=153ms TTL=114
Reply from 66.151.138.182: bytes=32 time=145ms TTL=114
Reply from 66.151.138.182: bytes=32 time=131ms TTL=114
Reply from 66.151.138.182: bytes=32 time=143ms TTL=114
Reply from 66.151.138.182: bytes=32 time=153ms TTL=114
Reply from 66.151.138.182: bytes=32 time=161ms TTL=114
Reply from 66.151.138.182: bytes=32 time=135ms TTL=114
Reply from 66.151.138.182: bytes=32 time=103ms TTL=114
Reply from 66.151.138.182: bytes=32 time=77ms TTL=114
Reply from 66.151.138.182: bytes=32 time=75ms TTL=114
Reply from 66.151.138.182: bytes=32 time=75ms TTL=114
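If anyone wants to summarize a saved ping log like the one above instead of eyeballing it, here is a minimal sketch. It assumes Windows-style ping output ("time=76ms", "Request timed out."), treats the lowest round-trip time as the baseline, and counts anything over 1.5x that baseline as a spike. The 1.5 factor and the name `summarize_ping` are arbitrary choices for illustration.

```python
import re

# Matches the round-trip time in lines like "... bytes=32 time=76ms TTL=114"
REPLY = re.compile(r"time[=<](\d+)ms")


def summarize_ping(lines, spike_factor=1.5):
    """Count replies, timeouts and latency spikes in a ping log."""
    times, lost = [], 0
    for line in lines:
        match = REPLY.search(line)
        if match:
            times.append(int(match.group(1)))
        elif "timed out" in line.lower():
            lost += 1
    sent = len(times) + lost
    # ASSUMPTION: the minimum RTT is a fair stand-in for the "normal" ping
    baseline = min(times) if times else None
    spikes = [t for t in times if t > baseline * spike_factor] if times else []
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100 * lost / sent if sent else 0.0,
        "baseline_ms": baseline,
        "spikes": len(spikes),
        "worst_ms": max(times, default=None),
    }


sample = [
    "Reply from 66.151.138.182: bytes=32 time=75ms TTL=114",
    "Reply from 66.151.138.182: bytes=32 time=145ms TTL=114",
    "Request timed out.",
    "Reply from 66.151.138.182: bytes=32 time=77ms TTL=114",
]
print(summarize_ping(sample))
```

Running it over a longer capture (say, 10 minutes of pings) makes it easier to compare a "good" period against a "bad" one with actual numbers.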
-
Ok it's really bad right now so here is my traceroute:
root@raspberrypi:~# traceroute 66.151.138.182
traceroute to 66.151.138.182 (66.151.138.182), 30 hops max, 60 byte packets
1 192.168.0.1 (192.168.0.1) 0.697 ms 0.447 ms 0.626 ms
2 * * *
3 xe-0-0-2.164.aggr09.austtx.grandecom.net (216.82.213.158) 15.311 ms 16.220 ms 16.090 ms
4 ae0-0.aggr01.austtx.grandecom.net (24.155.121.76) 15.682 ms 15.474 ms 15.441 ms
5 * * *
6 * * *
7 * * *
8 * * *
9 ae-2-3514.edge2.Atlanta4.Level3.net (4.69.150.165) 40.719 ms 33.535 ms 33.006 ms
10 gtt-level3.Atlanta4.level3.net (4.68.63.158) 53.298 ms 54.249 ms 75.449 ms
11 xe-11-3-0.sjc10.ip4.gtt.net (89.149.182.69) 75.730 ms 75.609 ms 74.924 ms
12 internap-gw.ip4.gtt.net (77.67.70.26) 78.401 ms 78.878 ms 76.499 ms
13 border1.t3-1-bbnet1.sje003.pnap.net (66.151.144.27) 77.591 ms border1.t4-1-bbnet2.sje003.pnap.net (66.151.144.90) 75.723 ms 75.385 ms
14 v-66-151-138-182.unman-vds.internap-sj.nfoservers.com (66.151.138.182) 77.015 ms * 77.311 ms
Running continuous pings to each hop, everything looks great up until the address in hop 13, which is where it starts with the lag spikes and dropped packets.
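For anyone who wants to repeat the per-hop test, here is a rough sketch that pulls the responding hop addresses out of Linux-style traceroute output like the one above and builds a ping command for each. It only builds the commands and does not run them. The helper names `responding_hops` and `ping_commands` and the count of 60 pings are made up for illustration; `ping -c` is the usual Linux flag for the number of pings.

```python
import re

HOP = re.compile(r"^\s*(\d+)\s")                 # leading hop number
IP = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")  # IPv4 address in parentheses


def responding_hops(traceroute_output: str):
    """Return (hop_number, ip) pairs for hops that answered."""
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP.match(line)
        if not match:
            continue  # header or prompt line
        # A hop can list several routers; keep each address once, in order.
        for ip in dict.fromkeys(IP.findall(line)):
            hops.append((int(match.group(1)), ip))
    return hops


def ping_commands(hops, count=60):
    """Build one Linux ping command per responding hop."""
    return [f"ping -c {count} {ip}" for _, ip in hops]


sample = """traceroute to 66.151.138.182 (66.151.138.182), 30 hops max, 60 byte packets
 1 192.168.0.1 (192.168.0.1) 0.697 ms 0.447 ms 0.626 ms
 2 * * *
13 border1.t3-1-bbnet1.sje003.pnap.net (66.151.144.27) 77.591 ms border1.t4-1-bbnet2.sje003.pnap.net (66.151.144.90) 75.723 ms 75.385 ms"""

for command in ping_commands(responding_hops(sample)):
    print(command)
```

Hops that show "* * *" are skipped since there is no address to ping. Keep in mind that some routers deprioritize ping replies, so loss at a middle hop only matters if it also shows up at every hop after it.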
-
So I switched ISPs (back to Time Warner) and now that my traffic is no longer being routed through Level3 my ping is much, much better and more consistent. Why in hell I was being routed through Atlanta to get from Texas to CA is beyond me.
I'd encourage anyone else having severe lag spikes to run a trace and see if you're going through Level3's shit show.