Latest News for: Open source parquet


Exclusive: Voltron Data brings new power to AI with Theseus distributed query engine

Venture Beat 01 Dec 2023
Voltron Data, which made its splashy debut in 2022 with $110 million in funding, is all about bringing the power of multiple open source technologies, including Apache Arrow, Apache Parquet and Ibis, together to help improve data access.

Microsoft announces Mirroring, a way to copy databases

Venture Beat 15 Nov 2023
Mirroring essentially allows customers to replicate existing cloud data warehouses and databases into the open source Parquet and Delta formats, for direct use within Fabric’s warehouse experience – even when those external databases have proprietary data formats ... CSV and Parquet.

Python Pandas creator Wes McKinney joins Posit

InfoWorld 06 Nov 2023
Along with being known for the pandas data analysis library, McKinney has worked on other open-source projects including Apache Arrow, Apache Parquet, and Ibis.

How Microsoft Fabric aims to beat Amazon and Google in the cloud war

Venture Beat 06 Jun 2023
This is an open-source file format that is widely used in the industry and that organizes data by columns ... OneLake stores its data tables in an open-source format ... “Microsoft has done the right thing here,” said Tony Baer, an analyst at DBInsights, of its embrace of open source.

Microsoft’s data and analytics platform Fabric announces unified pricing, pressuring Google and Amazon

Venture Beat 01 Jun 2023
A singular data lake based on an open format ... OneLake is built around the open-source Apache Parquet format, allowing for a unified way to store and retrieve data natively across databases ... going to run data that shares the same open format, across both Synapse and Power BI.”

Informatica Announces Integration of its Intelligent Data Management Cloud with Microsoft Fabric

The Joplin Globe 23 May 2023
... language processing. Data Democratization for all users to rapidly discover, identify and gain access to trusted, reliable, quality data through Informatica’s Cloud Data Marketplace. Data integration from 200-plus sources into the open and governed OneLake Delta Parquet format.

Microsoft launches Fabric, a new end-to-end data and analytics platform

Crunch 23 May 2023
“There’s literally hundreds — if not thousands — of products and open source ... In part, that’s because the team decided to build the central data lake around the open-source Apache Parquet format, a column-oriented file format for data storage and retrieval.

Databricks acquires Okera to boost its AI-driven data governance platform

Venture Beat 04 May 2023
Xin noted that Nong Li, Okera’s co-founder and CEO, is widely known for creating Apache Parquet, which is an open-source standard storage format that Databricks and the rest of the industry build on. Li has also previously worked at Databricks and led the vectorized Parquet and ...

Databricks and Hugging Face integrate Apache Spark for faster AI model building

Venture Beat 27 Apr 2023
Traditionally, users had to write data into Parquet files, an open-source columnar format, and then reload them using Hugging Face datasets ... Databricks aims to support the open-source community through the new release, saying that Hugging Face excels in delivering open-source models and datasets.

InfluxData releases InfluxDB 3.0 product suite for time series analytics

Venture Beat 26 Apr 2023
Founded in 2012, InfluxData has been developing an open source based time series database, written in the Go programming language ... written in the open source Rust programming language ... With InfluxDB 3.0, the database is now making use of the open source Apache Parquet file format.

An open data lakehouse will maintain and grow the value of your data

Venture Beat 25 Mar 2023
Keeping it “open” (using open-source technologies and standards like PrestoDB, Parquet and Apache Hudi) not only saves money on license costs, but also gives your organization the reassurance that the technology that backs these critical systems is being continuously developed by companies that use it in production and at scale.

No storage, no cry: Sinking the data storage barrier

Venture Beat 19 Mar 2023
The third stage — modern open-source formats like Parquet and Iceberg, which more effectively collect compressed files — resulted from the fact that the capacity of these databases was outpaced by the data they were tasked to collect and analyze ... Combined with open-source formats ...

TigerGraph Expands Cloud Offering with Powerful New Capabilities to Streamline Adoption of ...

The Galveston Daily News 01 Mar 2023
New capabilities include: Enhanced data ingestion. Simplified streaming data ingestion setup and support for the popular Parquet data format with enhanced progress monitoring and messages. Parquet file format. Added support for the de facto open source storage format for big data as a data source. Multi-edge support ...

Google Cloud launches BigLake, a new cross-platform data storage engine

Crunch 06 Apr 2022
Google notes that BigLake will provide fine-grained access controls and that its API will span Google Cloud, as well as file formats like the open column-oriented Apache Parquet and open-source processing engines like Apache Spark.

Starburst to Host Inaugural Trino Summit on October 21-22, 2021

Daily Times Chronicle 14 Oct 2021
... optimization, including the use of rank() window function and querying Parquet data for files containing column indexes. Increased interoperability with other open source projects through the ClickHouse connector, BigQuery timestamp fixes, and updates to the Iceberg connector.