Commit 3abb847

Merge pull request #331 from SylvainCorlay/fix-typos
Fix typo fixes
2 parents 5214c2b + 79e8657 commit 3abb847

2 files changed

Lines changed: 4 additions & 4 deletions

src/components/fundable/descriptions/Decimal32InArrowCpp.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.

-Fixed-width decimal data in Arrow is usually represented the Decimal128 data type.
+Fixed-width decimal data in Arrow is usually represented by the Decimal128 data type.
 This data type has non-trivial memory costs (16 bytes per value) and computational costs (operations on 128-bit integers must be emulated on most if not all architectures).

 Arrow recently gained Decimal32 and Decimal64 data types which, as their names suggest, encode fixed-width decimal data more compactly.

src/components/fundable/descriptions/ParquetNullOptimizations.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@

 Apache Parquet is an open source, column-oriented data file format designed for
 efficient data storage and retrieval. Together with Apache Arrow for in-memory data,
-it has become for the *de facto* standard for efficient columnar analytics.
+it has become the *de facto* standard for efficient columnar analytics.

 While Parquet and Arrow are most often used together, they have incompatible physical
 representations of data with optional values: data where some values can be
@@ -18,7 +18,7 @@ the data is declared nullable (optional) at the schema level.
 We propose to optimize the conversion of null values from Parquet in Arrow C++
 for flat (non-nested) data:

-1. decoding Parquet definition levels directly into a Arrow validity bitmap, rather than using an
+1. decoding Parquet definition levels directly into an Arrow validity bitmap, rather than using an
 intermediate representation as 16-bit integers;

 2. avoiding decoding definition levels entirely when a data page's statistics shows
@@ -27,7 +27,7 @@ for flat (non-nested) data:
 As a subsequent task, these optimizations may be extended so as to apply to schemas
 with moderate amounts of nesting.

-This work will benefit to applications using Arrow C++ or any of its language
+This work will benefit applications using Arrow C++ or any of its language
 bindings (such as PyArrow, R-Arrow...).

 Depending on the typology of Parquet data, this could make Parquet reading 2x
