AI.news
主页教程研究工具模型AI创业讨论新闻WIKI🚀 创业库★ 投稿
AI+医疗机器人教育金融能源健康娱乐思考

GitHub - arunkatherashala/Kore

github.com
分享到

🚀 Kore — Killer Optimized Record Exchange

The fastest, most compressed columnar format for big data | v0.1.0

KORE is a high-performance binary file format optimized for analytical workloads. It provides:

  • 38% compression ratio (vs 63% for Parquet)
  • 131x query speedup with column pruning & predicate pushdown
  • Zero data loss verification (400K+ cells tested)
  • Native Spark integration — read/write with PySpark

Quick Start

Rust Library

Add this crate as a dependency (when published) or include from path:

use kore_fileformat::*;

// Write data
kore_write_simple("output.kore", schema_json, data_json)?;

// Read data
let data = kore_read_simple("output.kore")?;

// Read specific column
let col = kore_read_col_simple("output.kore", "column_name")?;

// Get file info
let info = kore_info_simple("output.kore")?;

PySpark Integration ⭐ NEW

from pyspark.sql import SparkSession
from kore import KoreDataFrameReader, KoreDataFrameWriter

spark = SparkSession.builder.appName("KoreExample").getOrCreate()

# Read Kore file
df = KoreDataFrameReader(spark).load("data.kore")

# Write to Kore (38% compression!)
KoreDataFrameWriter(df).mode("overwrite").save("output.kore")

# Spark SQL support (3.5+)
spark.read.format("kore").load("file.kore").show()

See python/README.md for full PySpark documentation.

Publishing checklist

  • Ensure Cargo.toml metadata is correct (authors, repository, keywords).
  • Add LICENSE file if required (MIT by default here).
  • Replace any unimplemented!() stubs with full implementations if you need runtime functionality.
  • Run cargo build --release and cargo test to verify compilation and tests.
  • Optionally add CI configuration (GitHub Actions) for cargo test and cargo clippy.

Notes

This workspace contains copies of the original KORE source files. Some long implementations were stubbed out in this initial export; if you want the full original source code included verbatim, I can replace the stubs with the complete implementations from the upstream project files.