pivot_wider for Datasets

This is the simplest way I could think off:

ds = Dataset[{
  <|"Name" -> "Steven","Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[
  GroupBy[#Name &],
  ResourceFunction["MergeByKey"][{{"Name", "Born"} -> First}]
][
  Values,
  Append[KeyDrop[#, {"Year", "Score"}],
    AssociationThread["Score_" <> ToString[#]& /@ #Year, #Score]
  ]&
]

enter image description here

If different people have different years, you might want to use KeyUnion to make the dataset square again. Documentation for MergeByKey is here


You can use the package "DataReshape.m" as shown below. (It is in my ToDo list to submit the corresponding functions to Wolfram Function Repository very soon.)

Load the package:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/DataReshape.m"]

Here is the original dataset:

dsData = Dataset[{<|"Name" -> "Steven", "Born" -> 1980, 
     "Year" -> 2017, "Score" -> 115|>, <|"Name" -> "Steven", 
     "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>, <|
     "Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, 
     "Score" -> 70|>, <|"Name" -> "Joe", "Born" -> 1981, 
     "Year" -> 2018, "Score" -> 300|>}];
dsData

enter image description here

Let us convert it to a “proper” long form with ID variables all except “Score”:

dsLong = ToLongForm[dsData, {"Name", "Born", "Year"}, Automatic]

enter image description here

Here we change the values of the column “Variable” to have the values of the column “Year” as suffixes:

dsLong2 = 
 dsLong[All, 
  Join[KeyTake[#, {"Name", "Born", "Value"}], <|
     "Variable" -> 
      StringRiffle[{#Variable, ToString[#Year]}, "_"]|>] &]

enter image description here

Here we convert to wide form with ID variables “Name” and “Born”:

ToWideForm[dsLong2, {"Name", "Born"}, "Variable", "Value"]

enter image description here


Both Sjoerd's and Anton's answers work nicely. Upon Sjoerd's solution using MergeByKey from the Wolfram Function Repository, I just wanted to post here for completeness that one can find almost the same solution with Merge from the Wolfram Language:

ds = Dataset[{
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
  <|"Name" -> "Joe",    "Born" -> 1981, "Year" -> 2017, "Score" -> 70 |>,
  <|"Name" -> "Joe",    "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[GroupBy["Name"], Merge[#, Identity] &][Values,
  Append[Map[First, KeyDrop[#, {"Year", "Score"}]], 
    AssociationThread["Score_" <> ToString[#] & /@ #Year, #Score]] &]

Tags:

Dataset