About Me!

This blog is about my musings and thoughts. I hope you find it useful, at most, and entertaining, at least.

jim@jimkeener.com

TDF -- Text Data File rev 3

Date: 2013-10-11
Tags: computers protocols data programming

Format

Files and text fields are encoded in UTF-8 unless otherwise specified.

Delimiters (, and ;) and the escape characters themselves ($ and %) are escaped as follows:

Character  Escape
,          $%
;          %$
$          $$
%          %%
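
As a rough sketch of the escape table above, here is one way to escape and unescape a text value. The function names `tdf_escape` and `tdf_unescape` are my own and not part of the format. I'm assuming every `$` or `%` in escaped text starts a two-character sequence, so a left-to-right scan is enough.

```python
# Escape table from the Format section: character -> two-character sequence.
ESCAPES = {",": "$%", ";": "%$", "$": "$$", "%": "%%"}
UNESCAPES = {seq: ch for ch, seq in ESCAPES.items()}


def tdf_escape(text: str) -> str:
    """Replace each delimiter or escape character with its sequence."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def tdf_unescape(text: str) -> str:
    """Scan left to right, turning each two-character sequence back."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in UNESCAPES:
            out.append(UNESCAPES[pair])
            i += 2
        else:
            # Ordinary character (or a malformed lone $/% at the very end,
            # which this sketch simply passes through unchanged).
            out.append(text[i])
            i += 1
    return "".join(out)
```

For example, `tdf_unescape("Absorption at 520cm-1$% over 4 experiments")` gives back the text with a literal comma after `520cm-1`.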

Headers

Headers are formatted as RFC 2616 headers (HTTP/1.1 header syntax). Headers should be sorted in lexical order.

Field definitions

Field definitions contain a Name, Type, and Encoding. Each attribute is written as a key, a colon, and a comma-separated list of values, and attributes are separated by semicolons.

Name must correspond to a field name in the first line of the data.

Optional attributes, such as Description, may be included.

Example:

Field: Name: val; Type: Int; Encoding: Dec, UTF8;

Alternatively:

Field: Name: val;
       Type: Int;
       Encoding: Dec, UTF8;
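
A minimal sketch of reading a field definition, assuming the text after `Field:` has already been pulled out of the header (folded continuation lines included). The name `parse_field_definition` is hypothetical, and I'm assuming each attribute is split on its first colon only and that values are unescaped after splitting.

```python
import re

# Two-character escape sequences from the Format section.
_UNESCAPES = {"$%": ",", "%$": ";", "$$": "$", "%%": "%"}


def _unescape(text: str) -> str:
    # Every pair drawn from $ and % is a valid sequence, so a
    # non-overlapping left-to-right substitution is enough here.
    return re.sub(r"[$%][$%]", lambda m: _UNESCAPES[m.group()], text)


def parse_field_definition(value: str) -> dict[str, list[str]]:
    """Turn 'Name: val; Type: Int; Encoding: Dec, UTF8;' into a dict of lists."""
    attrs = {}
    for part in value.split(";"):
        part = part.strip()          # also removes folding whitespace/newlines
        if not part:
            continue                 # trailing semicolon leaves an empty part
        key, _, raw = part.partition(":")
        attrs[key.strip()] = [_unescape(v.strip()) for v in raw.split(",")]
    return attrs
```

Both the single-line and the folded form of the example parse to `{"Name": ["val"], "Type": ["Int"], "Encoding": ["Dec", "UTF8"]}`.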

Metadata

Header: Digest
Value: digest hex-hash
Example: sha1 yaUxx5mrIRyXNdovreYa/PFh0PE=
Comment: Calculated without this line and without expanding headers (they should each be on a single line).

Header: Last-Modified
Value: ISO 8601 date
Example: 2013-08-06T08:52:00EST

Header: Signature
Value: user;key fingerprint;signature
Example: Jim <jim@example.com>;f642a8d2552281d792b52a17cbe79f3163b296f3;MIGHAkER9CmV5WJPB3hnk9eD31oqhAKWTsXVKubdIffMM9ocjU667p5yDh8xrOuOx0T8xx2NTQgmnDgsrPaXLK8WiMEaaQJCAYn2TwWkSVpgTM7oFg3O6r9ZTSRTnqZhxyk3g7O1SDHcqxohBREITiMsIFFNjv6m6sj/M8e4ndlaHZVgv5J/T+NR
Comment: Because of their size, EC keys are useful for this. Calculated without this line.

Header: Author
Value: user
Example: Jim <jim@example.com>

Header: Description
Value: Description
Example: This data is awesome!

Header: Source
Value: URL or description of the source of the data
Example: http://example.com/data or Jim's Lab @ HisHouseU
Comment: Where to go to find out more, or who made it. May be repeated.
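
Here is one possible reading of the Digest rule, sketched with Python's `hashlib`. The function name `tdf_digest` is made up, and several details are my assumptions rather than anything stated here: LF line endings, a blank line between headers and data, each header already on a single line, and every other header line (including any Signature line) left in the hashed bytes.

```python
import hashlib


def tdf_digest(raw: bytes, algorithm: str = "sha1") -> str:
    """Hash the file with the Digest header line left out (assumed reading)."""
    # Headers end at the first blank line; everything after is data.
    head, sep, body = raw.partition(b"\n\n")
    kept = [
        line for line in head.split(b"\n")
        if not line.lower().startswith(b"digest:")
    ]
    return hashlib.new(algorithm, b"\n".join(kept) + sep + body).hexdigest()
```

With this reading, adding or updating the Digest line never changes the value it is supposed to contain.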

Types

Types may be chained, e.g. Bin32,Float

Type      Comment
Bin       Arbitrary binary data
Integer
Float
UUID      UUID
Text      Text
Time      Date and/or time
CI        Field represents the confidence interval for the field named in the
          For attribute of its field definition. The field definition must
          also contain “Offset: Min” or “Offset: Max”, and a p-value attribute
          giving the p-value for this CI (0.05 => 95% interval).
Geometry  Stores geometry types

Formats

Format   Comment                                         Types
Dec      Decimal (default)                               Int, Float, CI
Hex      Hex encoded / Base-16 encoded                   Bin, Int
B32      Base-32 encoded                                 Bin
B36      Base-36 encoded                                 Bin
B58      Base-58 encoded                                 Bin
B64      Base-64 encoded                                 Bin
B85      Base-85 encoded                                 Bin
UU       UUEncoded                                       Bin
XX       XXEncoded                                       Bin
UTF8     UTF-8 encoded text (default)                    All
ASCII    ASCII text                                      All
Latin1   Latin-1 (ISO 8859-1) text                       All
WKT      Well-Known Text                                 Geometry
WKB      Well-Known Binary                               Geometry
UDT      Unix Date Time                                  Time
UMT      Unix Date Time, milliseconds / JavaScript time  Time
UNT      Unix Date Time, nanoseconds                     Time
EDT      Excel Date Time                                 Time
WFT      Windows File Time                               Time
ISO8601  ISO 8601 format                                 Time
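
To illustrate how an Encoding might drive decoding, here is a small sketch covering only Hex, B32, and B64 for Bin and Dec and Hex for Int. The names `decode_bin` and `decode_int` are hypothetical. I'm assuming the standard RFC 4648 alphabets that Python's `base64` module implements; the other encodings in the table are left out because the alphabet or variant isn't specified here.

```python
import base64

# Encoding name -> decoder for Bin values (a partial table, by assumption).
BIN_DECODERS = {
    "Hex": bytes.fromhex,
    "B32": base64.b32decode,
    "B64": base64.b64decode,
}


def decode_bin(value: str, encoding: str) -> bytes:
    """Decode a Bin field; raises KeyError for encodings not sketched here."""
    return BIN_DECODERS[encoding](value)


def decode_int(value: str, encoding: str = "Dec") -> int:
    """Decode an Int field written in decimal (the default) or hex."""
    return int(value, 16 if encoding == "Hex" else 10)
```

A lookup table keeps the mapping from encoding name to decoder in one place, so adding another encoding is a one-line change once its exact variant is pinned down.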


Author: Jim
Description: This data was collected with a Blah Blah Spectrometer. The procedure can be found at http://example.com/proc
Digest: sha1 583816a652dcf8365cceabfc4945c35a84e1614c
Field: Name: Abs_ci_max; Type: CI, Float; Encoding: Dec; For: Absorption; Offset: max; p-value: 0.05
Field: Name: Abs_ci_min; Type: CI, Float; Encoding: Dec; For: Absorption; Offset: min; p-value: 0.05
Field: Name: Absorption; Type: Float; Encoding: Dec; Description: Absorption at 520cm-1$% over 4 experiments
Field: Name: Time; Type: Int; Encoding: Dec; Description: Seconds from starting
Last-Modified: 2013-10-04T08:52:00EST

Time, Absorption, Abs_ci_min, Abs_ci_max
0,0.0,0.0,0.0
10,1.0,0.0,3.0
15,4.0,2.0,5.0
20,9.0,6.0,12.0
23,14.0,11.0,18.0
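
Putting the pieces together, here is a rough sketch of reading a file like the one above into typed rows. `parse_tdf` and its helpers are names I made up. I'm assuming a blank line separates headers from data, that folded header lines start with whitespace, that a missing Type means Text, and that a chained type such as `CI, Float` is read as a float.

```python
import re

_UNESCAPES = {"$%": ",", "%$": ";", "$$": "$", "%%": "%"}


def unescape(text: str) -> str:
    return re.sub(r"[$%][$%]", lambda m: _UNESCAPES[m.group()], text)


def parse_tdf(text: str):
    """Return (headers, rows): headers as (name, value) pairs, rows as dicts."""
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.splitlines():
        if line[:1] in (" ", "\t") and headers:
            # Folded continuation line: append to the previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))

    # Map each field name to its list of types, read from the Field headers.
    types = {}
    for name, value in headers:
        if name != "Field":
            continue
        attrs = {}
        for part in value.split(";"):
            key, sep, raw = part.partition(":")
            if sep:
                attrs[key.strip()] = [v.strip() for v in raw.split(",")]
        types[attrs["Name"][0]] = attrs.get("Type", ["Text"])

    def convert(field, cell):
        kinds = types.get(field, ["Text"])
        if "Float" in kinds:
            return float(cell)
        if "Int" in kinds or "Integer" in kinds:
            return int(cell)
        return unescape(cell)

    # Escaped commas appear as $%, so a plain split on "," is safe.
    lines = [ln for ln in body.splitlines() if ln.strip()]
    names = [n.strip() for n in lines[0].split(",")]
    rows = [
        {n: convert(n, c.strip()) for n, c in zip(names, ln.split(","))}
        for ln in lines[1:]
    ]
    return headers, rows


SAMPLE = """\
Author: Jim
Field: Name: Abs_ci_max; Type: CI, Float; Encoding: Dec; For: Absorption; Offset: max; p-value: 0.05
Field: Name: Abs_ci_min; Type: CI, Float; Encoding: Dec; For: Absorption; Offset: min; p-value: 0.05
Field: Name: Absorption; Type: Float; Encoding: Dec; Description: Absorption at 520cm-1$% over 4 experiments
Field: Name: Time; Type: Int; Encoding: Dec; Description: Seconds from starting

Time, Absorption, Abs_ci_min, Abs_ci_max
0,0.0,0.0,0.0
10,1.0,0.0,3.0
"""

headers, rows = parse_tdf(SAMPLE)
```

Each row comes back as a dict keyed by the names in the first data line, with Time as an `int` and the other three columns as `float`.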

Note: Windows File Time is in 100-nanosecond ticks since 12:00 A.M. January 1, 1601 UTC
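
As a sketch of how the integer Time encodings relate to each other, the following converts UDT, UMT, UNT, and WFT values to a UTC `datetime`. The function name `decode_time` is hypothetical, and I'm assuming values are plain decimal integers and that sub-microsecond precision can be dropped, since Python's `datetime` only stores microseconds.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
WFT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# Encoding -> (epoch, how many of its units make one microsecond, or the
# reverse). Stored as (epoch, multiplier, divisor) to stay in integers.
_TIME_UNITS = {
    "UDT": (UNIX_EPOCH, 1_000_000, 1),  # seconds
    "UMT": (UNIX_EPOCH, 1_000, 1),      # milliseconds
    "UNT": (UNIX_EPOCH, 1, 1_000),      # nanoseconds
    "WFT": (WFT_EPOCH, 1, 10),          # 100-nanosecond ticks since 1601
}


def decode_time(value: str, encoding: str) -> datetime:
    """Convert an integer timestamp in the given encoding to UTC."""
    epoch, mul, div = _TIME_UNITS[encoding]
    microseconds = int(value) * mul // div
    return epoch + timedelta(microseconds=microseconds)
```

Integer arithmetic is used throughout so that large nanosecond or tick counts don't lose precision to floating point before the final truncation to microseconds.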