Difference between revisions of "EC Protocol HOWTO"
Revision as of 12:47, 9 May 2008
Work in progress, this site is under heavy construction.
Basic Protocol Structure
Protocol definition
Short description:
The EC protocol consists of two layers: a low-level transmission layer and
a high-level application layer.
The transmission layer consists of two uint32 values.
A uint32 flags value specifies the format of the message, e.g. whether the packet uses UTF-8 encoded numbers or is compressed with zlib.
The next uint32 gives the size of the application layer data.
The application layer consists of an opcode and a tag count, followed by a tag structure.
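As a rough illustration of the two-value header, the Python sketch below splits a raw message into flags, body length and body. It assumes the layout seen in the hex examples on this page (two big-endian uint32 values, flags first). The function name `read_header` is hypothetical and not taken from the aMule source.

```python
import struct

def read_header(data: bytes):
    """Split a raw EC message into (flags, body_length, body).

    Assumption: an 8-byte header of two big-endian uint32 values,
    flags first and body length second, as in the hex examples.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 header bytes")
    flags, length = struct.unpack(">II", data[:8])
    body = data[8:8 + length]
    if len(body) != length:
        raise ValueError("truncated body")
    return flags, length, body

# Header of the search request example, followed by 33 placeholder bytes.
raw = bytes.fromhex("0000002000000021") + bytes(33)
flags, length, body = read_header(raw)
print(hex(flags), length, len(body))
```

A real reader would first collect the 8 header bytes from the socket and only then wait for `length` more bytes.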
Transmission layer
The transmission layer is completely independent of the application layer,
and holds only transport-related information.
The transmission layer actually consists of a uint32 number, referred to below as flags,
which describes flags for the current transmission session (send/receive operation).
This four-byte value is the only one in the whole protocol that is transmitted LSB first,
with zero bytes omitted (therefore an empty transmission flags value is sent as 0x20, not 0x20 0x0 0x0 0x0).
Bit description:
- bit 0: Compression flag. When set, zlib compression is applied to the application layer's data.
- bit 1: Compressed numbers. When set (presumably on small packets that are not worth compressing with zlib), all the numbers used
- in the protocol are encoded as a wide char converted to UTF-8, so that some zero bytes do not have to be sent over the network.
 
- bit 2: Has ID. When this flag is set, a uint32 number follows the flags, which is the ID of this packet. The response to this
- packet must carry the same ID. The only requirement for ID values is that they should be unique within one session (or at
- least not repeat for a reasonably long time).
 
- bit 3: Reserved for later use.
- bit 4: Accepts value present. A client sets this flag and sends another uint32 value (encoded as above, LSB first, zero
- bytes omitted), which is a fully constructed flags value; set bits mean that the client can accept those extensions.
- No extension can be used until the other side has sent an accept value for it. It is not defined when this value
- should be sent; the first transfer is best, but it can be sent at any later time, even changing the previously announced flags.
 
- bit 5: Always set to 1, to distinguish from older (pre-rc8) clients.
- bit 6: Always set to 0, to distinguish from older (pre-rc8) clients.
- bits 7,15,23: Extension flag, meaning that the next byte of the flags is present.
- bits 8-14,16-22,24-31: Reserved for later use.
Transmission layer example:
- 0x30 0x23 <appdata> - Client uses no extensions on this packet, and indicates that it can accept zlib compression and compressed numbers.
Notes:
- Note 1: In the "accepts" value, the predefined flags must be set to their predefined values, because this can be used as a sort of sanity check.
- Note 2: Bits marked as "reserved" should always be set to 0.
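The bit layout above can be sketched in Python as follows. This only looks at the first flags byte; the constant and function names are hypothetical, chosen for illustration, and are not the identifiers used in the aMule source.

```python
# Masks for the first flags byte, following the bit description above.
FLAG_BITS = {
    "zlib": 0x01,          # bit 0: zlib compression
    "utf8_numbers": 0x02,  # bit 1: compressed (UTF-8 encoded) numbers
    "has_id": 0x04,        # bit 2: packet ID follows the flags
    "accepts": 0x10,       # bit 4: an "accepts" value follows
}
MARKER_SET = 0x20    # bit 5: always 1
MARKER_CLEAR = 0x40  # bit 6: always 0
EXTENSION = 0x80     # bit 7: another flags byte is present

def describe_flags(first_byte: int) -> dict:
    """Return the meaning of each known bit in the first flags byte."""
    if not first_byte & MARKER_SET or first_byte & MARKER_CLEAR:
        raise ValueError("bits 5/6 do not match the expected 1/0 pattern")
    info = {name: bool(first_byte & mask) for name, mask in FLAG_BITS.items()}
    info["more_bytes"] = bool(first_byte & EXTENSION)
    return info

# The two bytes from the transmission layer example: flags, then accepts value.
print(describe_flags(0x30))
print(describe_flags(0x23))
```

For the example bytes, 0x30 reports only `accepts`, and 0x23 reports `zlib` and `utf8_numbers`, which matches the description of the example.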
Application layer
Data transmission is done in packets. A packet can be considered
a special tag: one with no data, no tagLen field, and with the tagCount
field always present. All numbers that are part of the application layer are
transmitted in network byte order, i.e. MSB first.
- A packet contains the following:
- [ec_opcode_t] OPCODE
- [uint16] TAGCOUNT
- <tags>
 
In detail: The opcode tells what to do or what the data fields contain.
Its type is ec_opcode_t, which currently is a uint8.
TagCount is the number of first-level tags this packet has. The tags
themselves follow.
- A tag consists of:
- [ec_tagname_t] TAGNAME
- [ec_tagtype_t] TAGTYPE
- [ec_taglen_t] TAGLEN
- <[uint16] TAGCOUNT>?
- <sub-tags>
- <tag data>
 
 
At the moment ec_tagname_t is defined as a uint16 and ec_taglen_t as a uint32
value. ec_tagtype_t is a uint8.
TagName tells what the tag contains (see ECCodes.h for details).
TagType gives the type of this tag (see ECPacket.h for types).
TagLen contains the whole length of the tag, including the lengths of any
sub-tags, but without the size of the tagName, tagType and
tagLen fields. The lowest bit of the tagName field does not belong to the
tag name itself, so it has to be cleared before checking the name.
Tags may contain sub-tags to store information, and a tagCount field
is present only for these tags. The presence of the tagCount field can
be tested by checking the lowest bit of the tagName field: when it is
set, the tagCount field is present.
When a tag contains sub-tags, the sub-tags are sent before the tag's own
data. So the tag data length can be calculated by subtracting the length of
all sub-tags from the tagLen value; the remainder, if non-zero, is the
data length.
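The tag layout can be illustrated with a small recursive parser. This is a minimal Python sketch for packets that do not use UTF-8 compressed numbers. It assumes, as the search request example on this page suggests, that tagLen also counts the two bytes of the tagCount field. The names `parse_tag` and `parse_packet` are hypothetical.

```python
import struct

def parse_tag(buf: bytes, pos: int = 0):
    """Parse one tag at pos; return (tag, position after the tag)."""
    raw_name, tag_type, tag_len = struct.unpack_from(">HBI", buf, pos)
    pos += 7                      # tagName (2) + tagType (1) + tagLen (4)
    end = pos + tag_len           # tagLen excludes the three fields above
    subtags = []
    if raw_name & 1:              # lowest bit set: tagCount field present
        (count,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        for _ in range(count):
            sub, pos = parse_tag(buf, pos)
            subtags.append(sub)
    data = buf[pos:end]           # whatever remains is the tag's own data
    tag = {"name": raw_name >> 1, "type": tag_type,
           "subtags": subtags, "data": data}
    return tag, end

def parse_packet(body: bytes):
    """Parse an application layer packet: opcode, tagCount, then tags."""
    opcode, tag_count = struct.unpack_from(">BH", body, 0)
    pos = 3
    tags = []
    for _ in range(tag_count):
        tag, pos = parse_tag(body, pos)
        tags.append(tag)
    return opcode, tags

# Application layer bytes of the search request example on this page.
body = bytes.fromhex(
    "26" "0001"
    "0e03" "02" "00000017" "0002"
    "0e04" "06" "00000005" "7465737400"
    "0e0a" "06" "00000001" "00"
    "00"
)
opcode, tags = parse_packet(body)
print(hex(opcode), tags)
```

The sketch does no bounds checking beyond what `struct.unpack_from` enforces; a robust parser would also verify that sub-tags never run past the end of their parent.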
Future Changes
Future changes of the EC protocol (probably after 2.2.0) may be:
- no more \0 for string termination
- last bit of flag byte indicates a following flag byte, and so on
Resources
Definitions of the opcodes and tag codes can be found at these locations in the source:
- ./src/lib/ec/[c#|cpp|java]/ECCodes.[cs|h|java]
- ./docs/EC_Protocol.txt (outdated, but contains much useful information)
Examples
Notes:
- aMule sends EC packets in two flavours, depending on the packet size (although it would understand other flag options as well):
- zlib compressed application data that, once decompressed, does not use UTF-8 compressed numbers.
- UTF-8 compressed numbers in the application data.
 
- The tag size does not take into account the size of UTF-8 compressed numbers in subtags. When parsing, you may want to ignore the length completely and compute it from the size of the subtags plus the size of the value field (determined by the value type flag).
This is a packet in hex values that is sent to aMule
for authorization:
00 00 00 22 //flag
00 00 00 36 //packet body length 54
02 //EC_OP_AUTH_REQ
04 //tag count
c8 80 //EC_TAG_CLIENT_NAME
06 //EC_TAGTYPE_STRING
0d //value length 13
61 6d 75 6c 65 2d 72 65 6d 6f 74 65 00 //"amule-remote\0"
c8 82 //EC_TAG_CLIENT_VERSION
06 //EC_TAGTYPE_STRING
07 //value length 7
30 78 30 30 30 31 00 //"0x0001\0"
04 //EC_TAG_PROTOCOL_VERSION
03 //EC_TAGTYPE_UINT16
02 //value length 2
02 00 //value is defined by EC_CURRENT_PROTOCOL_VERSION
02 //EC_TAG_PASSWD_HASH
09 //EC_TAGTYPE_HASH16
10 //value length 16
47 bc e5 c7 4f 58 9f 48 //md5 hashed password string
67 db d5 7e 9c a9 f8 08 //password "aaa" was used
c8 80 is in fact a UTF-8 encoded number. It decodes to 02 00 (or 512 in decimal).
Like every tag code, it is shifted one bit to the left to
make room for a bit that indicates the presence of subtags.
The lowest bit of 02 00 is 0, so this tag doesn't have subtags.
When we shift the value to the right by one bit (or divide by 2),
we get 01 00.
That's the value that can be found in ECCodes.h.
This is a simple search request that is sent without UTF-8 compressed numbers.
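The decoding steps above can be reproduced with a short Python sketch. It hand-decodes standard UTF-8 sequences of one to four bytes; how aMule encodes values that do not fit into four UTF-8 bytes is not covered here. The function name `read_utf8_number` is hypothetical.

```python
def read_utf8_number(buf: bytes, pos: int = 0):
    """Decode one UTF-8 encoded number; return (value, next position)."""
    first = buf[pos]
    if first < 0x80:                 # 0xxxxxxx: single byte
        length, value = 1, first
    elif first >> 5 == 0b110:        # 110xxxxx: two bytes
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:       # 1110xxxx: three bytes
        length, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:      # 11110xxx: four bytes
        length, value = 4, first & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")
    if pos + length > len(buf):
        raise ValueError("truncated UTF-8 sequence")
    for b in buf[pos + 1:pos + length]:
        if b >> 6 != 0b10:           # continuation bytes are 10xxxxxx
            raise ValueError("invalid UTF-8 continuation byte")
        value = (value << 6) | (b & 0x3F)
    return value, pos + length

raw, _ = read_utf8_number(bytes.fromhex("c880"))
has_subtags = bool(raw & 1)   # lowest bit: subtag indicator
tag_name = raw >> 1           # remaining bits: the tag code
print(hex(raw), has_subtags, hex(tag_name))  # prints: 0x200 False 0x100
```

The same steps turn c8 82 into 0x202 and then 0x101, and the single byte 04 into 0x2.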
00 00 00 20 //plain format, no compression
00 00 00 21 //message length: 33
26 //EC_OP_SEARCH_START
00 01 //tag count
0e 03 //EC_TAG_SEARCH_TYPE
02 //EC_TAGTYPE_UINT8
00 00 00 17 //tag length: 23
00 02 //subtag count
0e 04 //EC_TAG_SEARCH_NAME
06 //EC_TAGTYPE_STRING
00 00 00 05 //tag length
74 65 73 74 00 //"test\0"
0e 0a //EC_TAG_SEARCH_FILE_TYPE
06 //EC_TAGTYPE_STRING
00 00 00 01 //tag length
00 //"\0"
00 //uint8 search type (local)
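To show how such a request could be assembled, here is a minimal Python sketch that rebuilds the search request byte for byte. The helper names `build_tag` and `build_message` are hypothetical. The numeric tag codes (0x0701, 0x0702, 0x0705) and the opcode 0x26 are simply read off the dump above by shifting each tag name one bit to the right; they were not checked against ECCodes.h.

```python
import struct

def build_tag(name: int, tag_type: int, data: bytes = b"", subtags=()):
    """Serialize one tag without UTF-8 number compression."""
    # The tag code is shifted left; the lowest bit marks the presence of subtags.
    raw_name = (name << 1) | (1 if subtags else 0)
    payload = b""
    if subtags:
        # Sub-tags (preceded by their count) come before the tag's own data.
        payload += struct.pack(">H", len(subtags)) + b"".join(subtags)
    payload += data
    return struct.pack(">HBI", raw_name, tag_type, len(payload)) + payload

def build_message(flags: int, opcode: int, tags) -> bytes:
    """Prepend the opcode/tagCount and the two-uint32 transmission header."""
    body = struct.pack(">BH", opcode, len(tags)) + b"".join(tags)
    return struct.pack(">II", flags, len(body)) + body

name = build_tag(0x0702, 0x06, b"test\x00")      # search name, string
file_type = build_tag(0x0705, 0x06, b"\x00")     # file type, empty string
search = build_tag(0x0701, 0x02, b"\x00", subtags=[name, file_type])
message = build_message(0x20, 0x26, [search])
print(message.hex())
```

The printed hex string matches the dump above when the spaces and comments are removed.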
