group

Utilities for collating and grouping records

See also this Stack Overflow question asking why NumPy could not be written in JavaScript, and the following libraries:

  • D3, specifically: group / rollup / index and flatGroup / flatRollup

  • DanfoJS - a JS library heavily inspired by Pandas, so anyone familiar with Pandas can get up to speed very quickly

  • dataframe-js - provides an immutable DataFrame data structure for working on rows and columns, with an API inspired by SQL and functional programming.

  • StdLib - a great library that compiles down to the C/C++ level to provide speeds comparable to NumPy.

  • NumJS - also a great number-processing library. It may not be as fast as StdLib, but it can sometimes be easier to use.

Methods

(static) by(collection, key, …key) → {SourceMap}

Group a collection into multiple levels of maps.

Parameters:
Name Type Attributes Description
collection Array

Array of objects or two dimensional array

key String | Number

the key to group the collection by

key String <repeatable>

the additional keys to group the collection by

Returns:
  • collection of results with the source as the key used for that level

For example:

initializeWeather = () => [
  { id: 1, city: 'Seattle',  month: 'Aug', precip: 0.87 },
  { id: 0, city: 'Seattle',  month: 'Apr', precip: 2.68 },
  { id: 2, city: 'Seattle',  month: 'Dec', precip: 5.31 },
  { id: 3, city: 'New York', month: 'Apr', precip: 3.94 },
  { id: 4, city: 'New York', month: 'Aug', precip: 4.13 },
  { id: 5, city: 'New York', month: 'Dec', precip: 3.58 },
  { id: 6, city: 'Chicago',  month: 'Apr', precip: 3.62 },
  { id: 8, city: 'Chicago',  month: 'Dec', precip: 2.56 },
  { id: 7, city: 'Chicago',  month: 'Aug', precip: 3.98 }
];
weather = initializeWeather();

utils.group.by(weather, 'city')

// provides

SourceMap(3) [Map] {
  'Seattle' => [
    { id: 1, city: 'Seattle', month: 'Aug', precip: 0.87 },
    { id: 0, city: 'Seattle', month: 'Apr', precip: 2.68 },
    { id: 2, city: 'Seattle', month: 'Dec', precip: 5.31 }
  ],
  'New York' => [
    { id: 3, city: 'New York', month: 'Apr', precip: 3.94 },
    { id: 4, city: 'New York', month: 'Aug', precip: 4.13 },
    { id: 5, city: 'New York', month: 'Dec', precip: 3.58 }
  ],
  'Chicago' => [
    { id: 6, city: 'Chicago', month: 'Apr', precip: 3.62 },
    { id: 8, city: 'Chicago', month: 'Dec', precip: 2.56 },
    { id: 7, city: 'Chicago', month: 'Aug', precip: 3.98 }
  ],
  source: 'city'
}

or using multiple groups: utils.group.by(weather, 'month', 'city')

provides:

SourceMap(3) [Map] {
  'Aug' => SourceMap(3) [Map] {
    'Seattle' => [ [Object] ],
    'New York' => [ [Object] ],
    'Chicago' => [ [Object] ],
    source: 'city'
  },
  'Apr' => SourceMap(3) [Map] {
    'Seattle' => [ [Object] ],
    'New York' => [ [Object] ],
    'Chicago' => [ [Object] ],
    source: 'city'
  },
  'Dec' => SourceMap(3) [Map] {
    'Seattle' => [ [Object] ],
    'New York' => [ [Object] ],
    'Chicago' => [ [Object] ],
    source: 'city'
  },
  source: 'month'
}
Type
SourceMap
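
To make the nesting concrete, here is a minimal sketch of multi-level grouping built on plain Map objects. It is not the library's implementation: groupByKeys is a hypothetical name, and the source tag is set as an ordinary property rather than through the SourceMap class.

```javascript
// Hypothetical helper (not the library's code): groups records into
// nested plain Maps, one level per key.
function groupByKeys(collection, key, ...restKeys) {
  const result = new Map();
  for (const record of collection) {
    const groupValue = record[key];
    if (!result.has(groupValue)) result.set(groupValue, []);
    result.get(groupValue).push(record);
  }
  // Tag this level with the key that produced it, mirroring the
  // `source` entry shown in the output above.
  result.source = key;
  // Recurse into each group when more keys remain.
  if (restKeys.length > 0) {
    for (const [groupValue, records] of result) {
      result.set(groupValue, groupByKeys(records, ...restKeys));
    }
  }
  return result;
}

const sample = [
  { city: 'Seattle', month: 'Aug', precip: 0.87 },
  { city: 'Seattle', month: 'Apr', precip: 2.68 },
  { city: 'Chicago', month: 'Apr', precip: 3.62 }
];

const byMonthThenCity = groupByKeys(sample, 'month', 'city');
console.log(byMonthThenCity.get('Apr').get('Chicago'));
// [ { city: 'Chicago', month: 'Apr', precip: 3.62 } ]
```

Because Map preserves insertion order, groups appear in the order their first record is encountered.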

(static) index(collection, indexFn) → {Map}

Index a collection of records to a map based on a specific value.

Unlike group.by, only one indexing function is accepted.

This is very helpful for joining records of two separate groups.

Example
athletes = [
  {name: "Neymar", sport: "Soccer", nation: "Brazil", earnings: 90},
  {name: "LeBron James", sport: "Basketball", nation: "United States",  earnings: 85.5},
  {name: "Roger Federer", sport: "Tennis", nation: "Switzerland", earnings: 77.2},
];

facts = [
  {about: "Neymar", fact: "Neymar is Neymar da Silva Santos Júnior"},
  {about: "Roger Federer", fact: "Federer has won 20 Grand Slam men's singles titles"},
  {about: "Megan Rapinoe", fact: "Rapinoe was named The Best FIFA Women's Player in 2019"}
];

athletesByName = utils.group.index(athletes, 'name');
facts.map(({about: name, ...rest}) => ({...rest, name, ...athletesByName.get(name)}))

// [
//   {
//     fact: 'Neymar is Neymar da Silva Santos Júnior',
//     name: 'Neymar', sport: 'Soccer', nation: 'Brazil', earnings: 90
//   },
//   {
//     fact: "Federer has won 20 Grand Slam men's singles titles",
//     name: 'Roger Federer', sport: 'Tennis', nation: 'Switzerland', earnings: 77.2
//   },
//   //-- not found
//   {
//     fact: "Rapinoe was named The Best FIFA Women's Player in 2019",
//     name: 'Megan Rapinoe'
//   }
// ]
Parameters:
Name Type Description
collection Array

Collection of objects to index by a specific field or value

indexFn function | String

the property name or function evaluating to a value for the index

Returns:
Type
Map
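
As a rough illustration of what indexing does, the sketch below builds a Map from either a property name or a function. indexBy is a hypothetical stand-in, not the library's code, and it assumes that when two records share a key the later one wins; the library may handle duplicates differently.

```javascript
// Hypothetical helper (not the library's code): index records by a
// property name or by the result of a function.
function indexBy(collection, indexFn) {
  const toKey = typeof indexFn === 'function'
    ? indexFn
    : (record) => record[indexFn];
  const result = new Map();
  for (const record of collection) {
    // Assumption: a later record overwrites an earlier one with the same key.
    result.set(toKey(record), record);
  }
  return result;
}

const athletes = [
  { name: 'Neymar', sport: 'Soccer' },
  { name: 'Roger Federer', sport: 'Tennis' }
];
const facts = [
  { about: 'Roger Federer', fact: "Federer has won 20 Grand Slam men's singles titles" },
  { about: 'Megan Rapinoe', fact: "Rapinoe was named The Best FIFA Women's Player in 2019" }
];

const athletesByName = indexBy(athletes, 'name');
const athletesBySport = indexBy(athletes, (a) => a.sport.toLowerCase());

// Left join: facts with no matching athlete keep only their own fields.
const joined = facts.map(({ about: name, ...rest }) => (
  { ...rest, name, ...athletesByName.get(name) }
));
console.log(joined[0].sport); // 'Tennis'
console.log(joined[1].sport); // undefined
```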

(static) rollup(collection, reducer, prop, …fields) → {SourceMap}

Group and "Reduce" a collection of records.

(Similar to d3 - rollup)

Example
weather = [
  { id: 1, city: 'Seattle',  month: 'Aug', precip: 0.87, dateTime: new Date(2020, 7, 1)  , year: 2020},
  { id: 2, city: 'Seattle',  month: 'Dec', precip: 5.31, dateTime: new Date(2020, 11, 1) , year: 2020},
  { id: 0, city: 'Seattle',  month: 'Apr', precip: 2.68, dateTime: new Date(2021, 3, 1)  , year: 2021},
  { id: 4, city: 'New York', month: 'Aug', precip: 4.13, dateTime: new Date(2020, 7, 1)  , year: 2020},
  { id: 5, city: 'New York', month: 'Dec', precip: 3.58, dateTime: new Date(2020, 11, 1) , year: 2020},
  { id: 3, city: 'New York', month: 'Apr', precip: 3.94, dateTime: new Date(2021, 3, 1)  , year: 2021},
  { id: 7, city: 'Chicago',  month: 'Aug', precip: 3.98, dateTime: new Date(2020, 7, 1)  , year: 2020},
  { id: 8, city: 'Chicago',  month: 'Dec', precip: 2.56, dateTime: new Date(2020, 11, 1) , year: 2020},
  { id: 6, city: 'Chicago',  month: 'Apr', precip: 3.62, dateTime: new Date(2021, 3, 1)  , year: 2021}
];

utils.group.rollup(weather, (collection) => collection.length, 'city')

// SourceMap(3) [Map] {
//   'Seattle' => 3,
//   'New York' => 3,
//   'Chicago' => 3,
//   source: 'city'
// }

utils.group.rollup(weather, r => r.length, 'city', 'year')

//  SourceMap(3) [Map] {
//   'Seattle' => SourceMap(2) [Map] { 2020 => 2, 2021 => 1, source: 'year' },
//   'New York' => SourceMap(2) [Map] { 2020 => 2, 2021 => 1, source: 'year' },
//   'Chicago' => SourceMap(2) [Map] { 2020 => 2, 2021 => 1, source: 'year' },
//   source: 'city'
// }
Parameters:
Name Type Attributes Description
collection Array

Collection to be rolled up

reducer function

{(Array) => any} Function to reduce the group of records down

prop String

The property on the objects to group by

fields any <repeatable>

Additional fields to group by

Returns:
  • a reduced sourceMap, where only the leaves of the groups are reduced
Type
SourceMap
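
The idea that only the leaves are reduced can be sketched with plain Map objects. rollupBy is a hypothetical name used for illustration, not the library's implementation, and it sets source as an ordinary property instead of using SourceMap.

```javascript
// Hypothetical helper (not the library's code): group by one or more
// properties, then apply the reducer to each leaf group only.
function rollupBy(collection, reducer, prop, ...fields) {
  const groups = new Map();
  for (const record of collection) {
    const groupValue = record[prop];
    if (!groups.has(groupValue)) groups.set(groupValue, []);
    groups.get(groupValue).push(record);
  }
  const result = new Map();
  for (const [groupValue, records] of groups) {
    // Intermediate levels stay as maps; the last level is reduced.
    result.set(
      groupValue,
      fields.length > 0 ? rollupBy(records, reducer, ...fields) : reducer(records)
    );
  }
  result.source = prop;
  return result;
}

const weather = [
  { city: 'Seattle', precip: 0.87, year: 2020 },
  { city: 'Seattle', precip: 5.31, year: 2020 },
  { city: 'Seattle', precip: 2.68, year: 2021 },
  { city: 'Chicago', precip: 3.98, year: 2020 }
];

const maxPrecip = (records) => Math.max(...records.map((r) => r.precip));
const byCityYear = rollupBy(weather, maxPrecip, 'city', 'year');
console.log(byCityYear.get('Seattle').get(2020)); // 5.31
console.log(byCityYear.get('Seattle').get(2021)); // 2.68
```

The reducer receives the full array of records for each leaf, so any aggregate (count, max, a summary object) can be returned.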

(static) separateByFields(collection, …fields) → {Array}

To Do:
  • Vega needs series on separate records

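Vega needs each series on separate objects.

As the example shows, every record in the collection is copied once per field given. Each copy preserves the original properties (including the groups used to make it) and adds a key property holding the field name and a value property holding that field's value.

See the vega-lite fold transform.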

Example
aggregateWeather = utils.group.by(weather, 'city')
  .reduce((group) => ({
    min: utils.agg.min(group, 'precip'),
    max: utils.agg.max(group, 'precip'),
    avg: utils.agg.avgMean(group, 'precip')
  }));

//-- gives

[
  { city: 'Seattle', min: 0.87, max: 5.31, avg: 2.953 },
  { city: 'New York', min: 3.58, max: 4.13, avg: 3.883 },
  { city: 'Chicago', min: 2.56, max: 3.98, avg: 3.387 }
]

utils.group.separateByFields(aggregateWeather, 'min', 'max', 'avg')

//-- gives
[
  { city: 'Seattle', min: 0.87, max: 5.31, avg: 2.953,  key: 'min', value: 0.87 },
  { city: 'New York', min: 3.58, max: 4.13, avg: 3.883, key: 'min', value: 3.58 },
  { city: 'Chicago', min: 2.56, max: 3.98, avg: 3.387,  key: 'min', value: 2.56 },
  { city: 'Seattle', min: 0.87, max: 5.31, avg: 2.953,  key: 'max', value: 5.31 },
  { city: 'New York', min: 3.58, max: 4.13, avg: 3.883, key: 'max', value: 4.13 },
  { city: 'Chicago', min: 2.56, max: 3.98, avg: 3.387,  key: 'max', value: 3.98 },
  ...
]
Parameters:
Name Type Attributes Description
collection Array

array of objects

fields any <repeatable>

string field name to separate by

Returns:
Type
Array
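
The output shown in the example can be reproduced with a short sketch. separateRecordsByFields is a hypothetical name, not the library's code; it assumes the result is ordered field by field, as in the example output.

```javascript
// Hypothetical helper (not the library's code): emit one copy of each
// record per field, adding `key` (field name) and `value` (field value).
function separateRecordsByFields(collection, ...fields) {
  return fields.flatMap((field) =>
    collection.map((record) => ({ ...record, key: field, value: record[field] }))
  );
}

const aggregates = [
  { city: 'Seattle', min: 0.87, max: 5.31 },
  { city: 'Chicago', min: 2.56, max: 3.98 }
];

const series = separateRecordsByFields(aggregates, 'min', 'max');
console.log(series.length); // 4
console.log(series[2]);
// { city: 'Seattle', min: 0.87, max: 5.31, key: 'max', value: 5.31 }
```

Each input record yields one output record per field, so charting tools that expect one series value per row can consume the result directly.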