Question 1

What is a .ORC file?

Accepted Answer

ORC (Optimized Row Columnar) is a columnar big-data storage format from the Hadoop ecosystem, created to store Apache Hive tables compactly and query them quickly. Hortonworks announced it together with Facebook to overcome the limitations of the earlier RCFile format and to speed up Hive queries. Unlike RCFile, which treated each column as an opaque blob, ORC retains the table’s type information, and each task writes a single output file.

Question 2

How do I open a .ORC file?

Accepted Answer

Drop a .ORC file onto Analyser at https://lab.valjdakosta.com/ and it is identified directly in your browser - no upload, no account and no software to install. Analyser identifies and reads developer and data-serialisation files: dependency lockfiles (npm/Yarn/pnpm/Cargo/Poetry/Bundler/Composer, with a locked-package count), binary serialisations (MessagePack, CBOR, BSON, raw Protobuf messages and descriptor sets), Python pickles with a security note, NumPy .npz and Java jar/war/ear archives, IDL schemas (FlatBuffers/Thrift/Cap'n Proto/HCL), MATLAB MAT-files, Redis RDB dumps and columnar big-data containers (Apache Arrow/Feather, Parquet, ORC). The JSON supersets JSON5/JSONC/Hjson now open in a full viewer - see Notebooks & data above.
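If you would rather check a file locally, the sketch below shows one rough way to tell whether a file looks like ORC: ORC files begin with the three ASCII bytes "ORC". The function name looks_like_orc is hypothetical, and this is not necessarily how Analyser identifies files. It inspects only the header, so a match is a hint rather than proof that the file is valid.

```python
ORC_MAGIC = b"ORC"  # ORC files start with these three ASCII bytes


def looks_like_orc(path):
    """Return True if the file at `path` begins with the ORC magic bytes.

    Hypothetical helper for illustration: it checks the header only and
    does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        return f.read(len(ORC_MAGIC)) == ORC_MAGIC
```

To read the actual rows and columns you need an ORC reader. In Python, for example, the pyarrow package ships a pyarrow.orc module, assuming it is installed in your environment.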