Data & ML Engineering Archive

Avro vs. CSV: Which Data Serialization Format Is Right for You?

Avro and CSV are two data serialization formats used to store and transmit data. But which one should you use? This blog post looks at how Avro and CSV differ in architecture, performance, and use cases, and offers advice on when each format is most suitable.

Parquet vs. Avro: Choosing Between Two Serialization Formats

Parquet and Avro are two popular open-source file formats used for serializing large datasets in big-data environments. Parquet was developed by Cloudera in collaboration with Twitter as an efficient columnar storage format optimized for distributed computing platforms such as Apache Hadoop. Avro was developed at the Apache Software Foundation as a binary serialization system designed