About Me!

This blog is about my musings and thoughts. I hope you find it useful, at most, and entertaining, at least.

Presence Elsewhere

jim@jimkeener.com

TDF -- Text Data File

Date: 2013-08-06
Tags: computers protocols data programming

In a previous post I described a binary file format for the exchange of data. Here I would like to describe a text-based counterpart: a format limited to 7-bit ASCII and editable in a plain text editor (without plugins).

Format

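Delimiters: , and ;

Inside a value, the delimiters and the two escape characters themselves are escaped as follows:

| Character | Escape |
| --- | --- |
| , | $$ |
| ; | %% |
| $ | $% |
| % | %$ |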

I’ve chosen these so that tools such as awk can still be used to parse files that contain delimiters inside the values.
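
The escape table can be sketched in a few lines of Python. This is only an illustration: the function names `escape` and `unescape` are made up here, and the decoder assumes its input was produced by the matching encoder.

```python
# Escape table from the post: each reserved character maps to a
# two-character sequence built only from '$' and '%'.
ESCAPES = {",": "$$", ";": "%%", "$": "$%", "%": "%$"}
UNESCAPES = {seq: ch for ch, seq in ESCAPES.items()}


def escape(value: str) -> str:
    """Replace every reserved character with its escape sequence."""
    return "".join(ESCAPES.get(ch, ch) for ch in value)


def unescape(field: str) -> str:
    """Reverse escape() by reading '$' and '%' two characters at a time."""
    out = []
    i = 0
    while i < len(field):
        if field[i] in "$%":
            pair = field[i:i + 2]
            if pair not in UNESCAPES:
                raise ValueError(f"invalid escape at offset {i}: {pair!r}")
            out.append(UNESCAPES[pair])
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)


print(escape("a,b;c"))  # a$$b%%c
```

Because an escaped value never contains a raw , or ; a line can be split on the delimiters first and unescaped afterwards, which is what keeps awk-style processing workable.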

; delimits types and encodings in header lines


Metadata-header: value

Field Name[;Type[,Type]*][;Format](,Field Name[;Type[,Type]*][;Format])*
value(,value)*
value(,value)*
….
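
A minimal sketch of reading this layout in Python follows. It rests on assumptions the post does not state: the metadata block is optional and, when present, ends at the first blank line, and the name `parse_tdf` is purely illustrative. The field-header line is returned unsplit, because a comma there may either chain types or separate fields.

```python
def parse_tdf(text: str):
    """Split a TDF document into (metadata, header_line, rows)."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines:
        raise ValueError("empty document")
    metadata = {}
    # Assumption: a metadata block exists only when the document contains
    # a blank line, and it ends at the first one.
    if "" in lines:
        blank = lines.index("")
        for line in lines[:blank]:
            name, _, value = line.partition(":")
            metadata[name.strip()] = value.strip()
        lines = [line for line in lines[blank + 1:] if line]
    header_line, *data = lines
    # Escaped values never contain a raw comma, so a plain split is enough.
    rows = [line.split(",") for line in data]
    return metadata, header_line, rows


doc = "Last-Modified: 2013-08-06T08:52:00EST\n\nValue,Square\n0,0\n1,1\n"
metadata, header_line, rows = parse_tdf(doc)
print(metadata, header_line, rows)
```

Values are left as escaped strings here; unescaping and type conversion would be separate steps.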

Examples


Last-Modified: 2013-08-06T08:52:00EST

Value;Int8;Hex,Square;Int8;Hex
00,00
01,01
02,04
03,09
04,10
05,19
06,24
07,31
08,40
09,51
0A,64

Value;Int,Bin8;Dec,Square;Int,Bin8;Dec
0,0
1,1
2,4
3,9
4,16
5,25
6,36
7,49
8,64
9,81
10,100

Value,Square
0,0
1,1
2,4
3,9
4,16
5,25
6,36
7,49
8,64
9,81
10,100

The last one looks familiar, huh?

Metadata

Metadata will be stored in RFC 2616 Header format (HTTP/1.1 header syntax).

I propose the following headers:

| Header | Value | Example | Comment |
| --- | --- | --- | --- |
| Digest | digest base64-hash | sha1 yaUxx5mrIRyXNdovreYa/PFh0PE= | Calculated without this line |
| Last-Modified | ISO 8601 date | 2013-08-06T08:52:00EST | |
| Signature | user;key;signature | Jim <jim@example.com>;MIHbAgEBBEF2cZ7A1w1t5+IOCJHxAzQtHKj1Z4TP0/kRmu5iKuHnlcL38zd2Bs8yGDmpuN7YpsmDtUG/pCMJ96wh5GzP37qfO6AHBgUrgQQAI6GBiQOBhgAEAE0dAxG5UM3ol75T2CM2ukAX4PnE/ZsZ5x0iUtGT66lhC75GrbWPVb1drZpVybK6ZISTYgvb2PYeHDSL13YyLj/0AF/muMRVEt7WXSGJh1j6+RMewktpJcqFdbpnrNhJ9JcQSAHkPXhGb7GZby84m9Q66FaAuYJs1VgXdyjaTVTSBHyb;MIGHAkER9CmV5WJPB3hnk9eD31oqhAKWTsXVKubdIffMM9ocjU667p5yDh8xrOuOx0T8xx2NTQgmnDgsrPaXLK8WiMEaaQJCAYn2TwWkSVpgTM7oFg3O6r9ZTSRTnqZhxyk3g7O1SDHcqxohBREITiMsIFFNjv6m6sj/M8e4ndlaHZVgv5J/T+NR | Because of their size, EC keys are useful for this. Calculated without this line. |
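
As a rough illustration of the Digest header, the sketch below computes and checks a SHA-1 digest in Python. It takes "calculated without this line" to mean that the Digest line is dropped entirely before hashing, and it assumes the document already has a metadata block ending in a blank line. The function names are hypothetical, and the Signature header is not covered.

```python
import base64
import hashlib


def strip_digest(text: str) -> str:
    """Return the document with any Digest header line removed."""
    kept, in_metadata = [], True
    for line in text.splitlines(keepends=True):
        if in_metadata and line.strip() == "":
            in_metadata = False  # first blank line ends the metadata block
        if in_metadata and line.startswith("Digest:"):
            continue
        kept.append(line)
    return "".join(kept)


def compute_digest(text: str) -> str:
    """Header value in the 'sha1 <base64-hash>' form used by the example."""
    raw = hashlib.sha1(strip_digest(text).encode("ascii")).digest()
    return "sha1 " + base64.b64encode(raw).decode("ascii")


def add_digest(text: str) -> str:
    """Prepend a Digest header (assumes a metadata block already exists)."""
    return f"Digest: {compute_digest(text)}\n" + strip_digest(text)


def verify_digest(text: str) -> bool:
    """Check the Digest header, if any, against the rest of the document."""
    for line in text.splitlines():
        if line.strip() == "":
            break
        if line.startswith("Digest:"):
            return line[len("Digest:"):].strip() == compute_digest(text)
    return False


doc = "Last-Modified: 2013-08-06T08:52:00EST\n\nValue,Square\n0,0\n1,1\n"
print(verify_digest(add_digest(doc)))  # True
```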

Types

Types may be chained, e.g. Bin32,Float
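
One reading of the chain Bin32,Float is "a 32-bit binary value interpreted as an IEEE 754 float". The Python sketch below decodes such a field from its Hex format under that reading. The big-endian byte order and the function name `decode_bin32_float` are assumptions, not part of the format as described.

```python
import struct


def decode_bin32_float(hex_text: str) -> float:
    """Decode a Hex-formatted Bin32,Float value."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 4:
        raise ValueError("a Bin32 value needs exactly 4 bytes")
    # Assumption: bytes are big-endian (most significant byte first).
    return struct.unpack(">f", raw)[0]


print(decode_bin32_float("3F800000"))  # 1.0
```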

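| Type | Comment |
| --- | --- |
| Bin8 | |
| Bin16 | |
| Bin32 | |
| Bin64 | |
| Bin128 | |
| Bin512 | |
| Bin1024 | |
| Bin2048 | |
| Bin4096 | |
| Bin8192 | |
| Integer | Two's-complement integer |
| Float | IEEE 754 |
| UUID | UUID |
| UDT | Unix Date Time |
| UNT | Unix Date Time (Nanoseconds) |
| EDT | Excel Date Time |
| TXT | Text, with delimiters escaped |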
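
The three date types can be converted to a common representation along these lines. This Python sketch assumes all values are in UTC and that EDT uses Excel's 1900 date system, whose serial numbers count days from 1899-12-30 for dates on or after 1900-03-01. The helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EXCEL_EPOCH = datetime(1899, 12, 30, tzinfo=timezone.utc)


def from_udt(seconds: int) -> datetime:
    """UDT: seconds since the Unix epoch."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def from_unt(nanoseconds: int) -> datetime:
    """UNT: nanoseconds since the Unix epoch.

    datetime only has microsecond resolution, so the last three
    digits are truncated.
    """
    return UNIX_EPOCH + timedelta(microseconds=nanoseconds // 1000)


def from_edt(serial: float) -> datetime:
    """EDT: Excel serial date (whole days plus fraction of a day)."""
    return EXCEL_EPOCH + timedelta(days=serial)


print(from_edt(25569) == from_udt(0))  # True
```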

Formats

| Format | Comment |
| --- | --- |
| Dec | Decimal (default) |
| Spec | Type's specified format |
| Hex | Hex encoded |
| B64 | Base-64 Encoded |
| B85 | Base-85 Encoded |
| Bin | Binary (only usable on fixed-length fields) |
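
To make the formats concrete, here is a small Python sketch that decodes Dec, Hex, and B64 values. The function names are hypothetical. B85 and Bin are left out because the post does not say which Base-85 alphabet is meant or how raw binary fields are framed.

```python
import base64


def decode_int(text: str, fmt: str = "Dec") -> int:
    """Decode an integer field written in the Dec (default) or Hex format."""
    if fmt == "Dec":
        return int(text, 10)
    if fmt == "Hex":
        return int(text, 16)
    raise ValueError(f"unsupported integer format: {fmt}")


def decode_b64(text: str) -> bytes:
    """Decode a B64 field into raw bytes, rejecting stray characters."""
    return base64.b64decode(text, validate=True)


# Last row of the hex example above: 0A,64 is 10 and 100.
print(decode_int("0A", "Hex"), decode_int("64", "Hex"))  # 10 100
```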