TIDE log Benchmark¶
*The information on this page is obsolete*
Chunk Sizes and Indices¶
We simulated "scanning" large TIDE log files with varying numbers of entries per chunk and entry sizes. By "scanning" we mean seeking to all `CHNK` blocks, reading the block headers, and reading all entry headers of the respective `CHNK`s.
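The scanning procedure can be sketched roughly as follows. The header layouts (field order and sizes) are hypothetical placeholders, since the actual TIDE block format is not specified on this page; the point is the access pattern: one small read per entry header, never touching the payloads.

```python
import struct

# Hypothetical layouts, little-endian:
CHUNK_HEADER_FMT = "<4sQI"  # block tag, chunk body size in bytes, entry count
ENTRY_HEADER_FMT = "<QQ"    # timestamp, payload size in bytes

def scan(path):
    """Visit every CHNK block and read all entry headers (no payloads)."""
    chunk_hdr_size = struct.calcsize(CHUNK_HEADER_FMT)
    entry_hdr_size = struct.calcsize(ENTRY_HEADER_FMT)
    n_entries = 0
    with open(path, "rb") as f:
        while True:
            hdr = f.read(chunk_hdr_size)
            if len(hdr) < chunk_hdr_size:
                break  # end of file
            tag, chunk_size, entry_count = struct.unpack(CHUNK_HEADER_FMT, hdr)
            body_start = f.tell()
            if tag == b"CHNK":
                # one seek + one small read per entry header
                offset = body_start
                for _ in range(entry_count):
                    f.seek(offset)
                    _ts, payload_size = struct.unpack(
                        ENTRY_HEADER_FMT, f.read(entry_hdr_size))
                    offset += entry_hdr_size + payload_size
                    n_entries += 1
            # jump over the chunk body to the next block
            f.seek(body_start + chunk_size)
    return n_entries
```

With 100,000,000 entries this issues 100,000,000 seek/read pairs, which is exactly the regime benchmarked in the third table row below.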
| Number of Chunks | Number of Entries per Chunk | Size of one Entry [B] | Filesize [GB] | Seeks and Reads | "Scan" Time [s] |
|---|---|---|---|---|---|
| 100 | 100 | 1,000,000 | ~ 9.4 | 10,000 | < 0.002 |
| 100 | 10,000 | 10,000 | ~ 9.4 | 1,000,000 | < 0.200 |
| 100 | 1,000,000 | 100 | ~ 11.0 | 100,000,000 | < 200.000 |
The number of seek and read operations grows linearly with the number of entry headers that have to be read. In the first two cases, the runtime reflects this very clearly. In the third case, per-operation overhead starts to dominate and performance worsens accordingly: linear scaling from the second case would predict ~ 20 s, but the actual runtime approaches 200 s. Almost the entire runtime consists of IOWAIT in this case (~ 15 s user, ~ 10 s system, ~ 175 s iowait).
Conclusion¶
- For high-frequency data with small individual entries, scanning all entries of a file is unacceptable
- Indexes (via `INDX` blocks) are unavoidable and should probably be made mandatory, at least for certain file structures
- Since all `CHNK` blocks have to be visited even with indexes, it is important to accumulate enough data in each chunk
    - On the other hand, periodically writing complete chunks to disc may be desirable for data safety and error recovery reasons
- `TIDE.num_chunks` and `TIDE.num_channels` are never really used since the whole file has to be scanned anyway
    - If data integrity is a concern, a hash of the file's content could be stored instead and only verified when explicitly requested