G+: One programming problem that perpetually bothers me is network …

David Coles
One programming problem that perpetually bothers me is network protocol parsing. While it's not too hard to handle one or two special cases, a general approach is much more difficult. To confuse the matter, you quickly run into issues like message serialization and handling asynchronous communication.

Initially I tried representing protocols as a nested structure (like Russian dolls). That works for simple packet structures, but falls apart for something like HTTP, where one HTTP message might be spread across several TCP packets (fragmentation) or share a TCP packet with other HTTP messages (packing). It also leaves nowhere to store protocol state.

Currently the best way I've seen of doing protocol parsing is to think of it as a stack of protocol layers, where the lower layers refine information before passing it to higher layers. This is the approach Twisted takes, whereby a Protocol consumes bytes via the `dataReceived(bytes)` method until enough data has been received to pass it up to one or more higher-level layers.
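
To make that concrete, here's a rough sketch of the layered idea in plain Python. It is not Twisted's API: `LineLayer`, `HeaderLayer`, `data_received` and `line_received` are names made up for illustration, and the "HTTP" here is only a header block ended by a blank line. The lower layer keeps its own buffer, so it copes with both fragmentation and packing.

```python
class LineLayer:
    """Lower layer (hypothetical): buffers raw bytes, passes complete lines up."""

    def __init__(self, upper):
        self.upper = upper   # next layer up; must offer line_received()
        self.buffer = b""    # protocol state lives on the layer, not the packet

    def data_received(self, data):
        # A chunk may hold part of a line (fragmentation)
        # or several lines at once (packing).
        self.buffer += data
        while b"\r\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            self.upper.line_received(line)


class HeaderLayer:
    """Upper layer (hypothetical): groups lines into blocks ended by a blank line."""

    def __init__(self):
        self.current = []    # lines of the message being assembled
        self.messages = []   # completed messages

    def line_received(self, line):
        if line:
            self.current.append(line)
        else:
            self.messages.append(self.current)
            self.current = []


upper = HeaderLayer()
lower = LineLayer(upper)

# The first message is split across two chunks; the second message
# shares a chunk with the tail of the first.
for chunk in (b"GET / HTTP/1.1\r\nHo", b"st: a\r\n\r\nGET /x HTTP/1.1\r\n\r\n"):
    lower.data_received(chunk)

print(upper.messages)
```

Neither layer knows how the bytes were chunked on the wire, which is exactly what the nested "Russian doll" representation couldn't express.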

If you've seen better ways of thinking about network protocols, I'd love to hear about them.

Patrick McLean
Layering is the classical/standard way to think about network protocols.

David Coles
Well, the layers aren't really the confusing part. It's the messages that go between the layers and creating a suitable class representation of the whole thing.

For instance, how does the TCP layer know to pass messages for port 80 up to the HTTP layer? You probably need either some sensible defaults and/or a way of adding higher layers to handle particular ports.
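
One way that could look, as a minimal sketch: the transport layer holds a port-to-handler table that starts from some defaults and can be extended. `TransportDemux`, `register` and `segment_received` are hypothetical names, and the handlers are plain callables standing in for real upper layers.

```python
class TransportDemux:
    """Hypothetical transport layer that routes payloads upward by port."""

    def __init__(self, defaults=None):
        # Conventional port-to-handler associations; callers may override them.
        self.handlers = dict(defaults or {})

    def register(self, port, handler):
        # Attach (or replace) the upper layer responsible for this port.
        self.handlers[port] = handler

    def segment_received(self, port, payload):
        handler = self.handlers.get(port)
        if handler is None:
            return False     # no upper layer is registered for this port
        handler(payload)
        return True


http_seen = []
echo_seen = []

# Port 80 comes from the defaults; port 7 is added afterwards.
demux = TransportDemux(defaults={80: http_seen.append})
demux.register(7, echo_seen.append)

demux.segment_received(80, b"GET / HTTP/1.1\r\n\r\n")
demux.segment_received(7, b"ping")
delivered = demux.segment_received(8080, b"ignored")
```

The defaults are only a starting table, so nothing stops you from registering a different handler on port 80.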

David Watt
Is HTTP a layer now? I guess theoretically it is, but there's no requirement that TCP traffic for port 80 be interpreted as such. You decide you want to interpret port 80 traffic as HTTP traffic when you run a web server. Hence, you assume browsers communicate as such, & throw out everything that doesn't conform.
There are conventions, but no defaults, I'd say.

David Coles
+David Watt In the Internet Protocol suite, HTTP is considered part of the application layer, but I'm talking about the more general case of "protocol layering" rather than the OSI or IP models.

You can still think of it as a layer (for example, you can put XMPP on top of HTTP) to add some higher-level feature, though such layers aren't as formalized.

As for defaults, you're exactly right. There's no hard requirement on the mappings; it just helps a lot for interoperability if you follow the IANA associations. Even at the Transport and Internet layers there's nothing special about these protocols except that they're built into the kernel. In fact you can even use raw sockets (`SOCK_RAW`) to add your own Transport layer protocol (often done for userspace protocol implementations or for handling more obscure protocols).