Finding out how quickly your implementation can serve a 4-byte response body is an interested but extremely limited look at how it performs. What happens when the response body is 4k — or 100k — is often much more interesting, and more representative of how it’ll handle real-life load.
Another thing to look at is how it handles load with a large number — say, 10,000 — of outstanding idle persistent connections (opened with a separate tool). A decent, modern server shouldn’t be bothered by this, but it causes issues more often than you’d think.
I both disagree and agree with this. The part I disagree with is that testing your implementation against a 4-byte response body is not helpful. I contend that it is. If you know that you need to get X from the new server that you are testing, then the first thing I’d test is the maximum performance, which means doing the least amount of work. For a web server that may mean serving a static file that only contains ‘Hello World!’ (13 bytes).
If I can’t get a web server to reach the performance level of X using the static hello world file, then there is no way it is magically going to reach it after adding on several layers of additional work. That is why measuring the peak possible performance is important, you immediately determine if your need of X is even possible.
If your test results are over X, great, then start adding on more/larger work loads, as suggested in the post. If your tests are under X then you need to consider some server level changes. That might mean hardware changes, operating system and software tuning, or all of the above.
I had originally intended to leave this as a comment on On HTTP Load Testing, but it requires me to create an account on the site, which I have no interest in doing.