Framework / Markdown Tips

It's worth noting that a plot can silently fail to appear when a code block contains both a constant declaration and a plot. Take the following code, which filters penguins to the "selected species" defined in the same block:

const selectedSpecies = "Adelie"
Plot.plot({
  marginLeft: 80,
  marks: [
    Plot.frame(),
    Plot.barX(penguins.filter(d => d.species === selectedSpecies), Plot.groupY(
      { x: "count" },
      { y: "island", x: "count" }
    ))
  ]
})

Although I'm not getting any errors, the plot isn't displayed. That is because of the way Observable Framework handles reactivity and JavaScript code blocks.

Typically, we rely on implicit display in our code blocks, and a plot that sits alone in a block just renders. But when a code block has multiple statements, or when the expression ends with a semicolon, Framework treats it as a program block rather than an expression block, which means implicit display doesn't apply.

You can fix this by splitting the code into two blocks, or by wrapping the plot in a call to the display function:

const selectedSpecies = "Adelie";
display(Plot.plot({
  marginLeft: 80,
  marks: [
    Plot.frame(),
    Plot.barX(penguins.filter(d => d.species === selectedSpecies), Plot.groupY(
      { x: "count" },
      { y: "island", x: "count" }
    ))
  ]
}))

Lab 2 Concepts

See State Management from week 8 notes
See Annotation from week 8 notes
See Data Joins from week 8 notes
See Color Scales from week 8 notes

Expanding on Data Joins

Let's explore another way to join data, building on the week 8 notes. What if we want to create a new metric, such as a ratio, while joining?

Let's take two new files on TV viewership and production cost.

viewership.csv: viewership data since launch date
production_cost.csv: production cost per episode and total

Just exploring the data in its existing form, we could make two bar charts -- one for viewership and one for cost.

In this case, let's say we want to visualize the total value of each show. We could consider total viewers as value, but we should consider the cost of the show as well. Let's make a new metric of views per dollar spent for each show.
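To make the metric concrete, here is a minimal sketch that computes it by hand for the three sample rows quoted in the prompt later in these notes. The names sampleShows and sampleRatios are made up for illustration, and the two files are combined by hand here rather than joined.

```javascript
// Sample rows copied from the two CSV excerpts in these notes,
// combined by hand (no join logic yet).
const sampleShows = [
  { show_name: "Stranger Things", total_estimated_viewership: 7387200000, total_production_cost_usd: 504000000 },
  { show_name: "The Crown", total_estimated_viewership: 4606000000, total_production_cost_usd: 780000000 },
  { show_name: "Wednesday", total_estimated_viewership: 2849000000, total_production_cost_usd: 68000000 }
];

// views per dollar = total estimated viewership / total production cost
const sampleRatios = sampleShows.map(show => ({
  show_name: show.show_name,
  views_per_dollar: show.total_estimated_viewership / show.total_production_cost_usd
}));

for (const { show_name, views_per_dollar } of sampleRatios) {
  console.log(show_name, views_per_dollar.toFixed(1));
}
// Stranger Things 14.7
// The Crown 5.9
// Wednesday 41.9
```

Wednesday has the fewest total viewers of the three but the highest views per dollar, which is exactly the kind of difference the two separate bar charts would hide.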

Join via new dataset

The first way to make this chart is to build a new array that includes the metric we are looking for, and then visualize that array. This is something that we could use AI to help us generate.

I am using Observable Plot to render charts in an Observable Framework dashboard. I need help creating a javascript function that can return a new dataset that can then be passed to the Plot function. I need a new dataset that includes each show and the (total) views per (total) dollars of each show.

The production data looks like this:

show_name,cost_per_episode_usd,number_of_episodes,total_production_cost_usd
Stranger Things,12000000,42,504000000
The Crown,13000000,60,780000000
Wednesday,8500000,8,68000000
...

The viewership data looks like this:

show_name,avg_weekly_viewers,launch_date,weeks_since_launch,total_estimated_viewership
Stranger Things,15200000,2016-07-15,486,7387200000
The Crown,9800000,2016-11-04,470,4606000000
Wednesday,18500000,2022-11-23,154,2849000000
...

Help me use javascript functions to turn this into one object with the show name and views per dollar.

I tried this, and the answer I got back was:

function calculateViewsPerDollar(productionData, viewershipData, sortDescending = true) {
  // Create a map of viewership data for quick lookup
  const viewershipMap = new Map(
    viewershipData.map(show => [show.show_name, show])
  );
  
  // Merge datasets and calculate views per dollar
  const result = productionData
    .map(show => {
      const viewership = viewershipMap.get(show.show_name);
      
      // Skip if no matching viewership data
      if (!viewership) return null;
      
      const totalCost = show.total_production_cost_usd;
      const totalViews = viewership.total_estimated_viewership;
      const viewsPerDollar = totalViews / totalCost;
      
      return {
        show_name: show.show_name,
        total_production_cost_usd: totalCost,
        total_estimated_viewership: totalViews,
        views_per_dollar: viewsPerDollar
      };
    })
    .filter(show => show !== null)
    .sort((a, b) => sortDescending ? 
      b.views_per_dollar - a.views_per_dollar : 
      a.views_per_dollar - b.views_per_dollar
    );
  
  return result;
}
const mergedData = calculateViewsPerDollar(cost, viewership);
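Note that the function above silently drops any production row that has no matching viewership row, so a misspelled show name would simply vanish from the chart. As a quick sanity check before plotting, a small helper along these lines could list the names that appear in only one file. The name findUnmatchedShows and its return shape are hypothetical, not part of the generated answer.

```javascript
// Hypothetical helper: lists show names that appear in only one dataset.
// `key` defaults to the "show_name" column used in both CSV files.
function findUnmatchedShows(productionData, viewershipData, key = "show_name") {
  const productionNames = new Set(productionData.map(row => row[key]));
  const viewershipNames = new Set(viewershipData.map(row => row[key]));
  return {
    // in the production file, but absent from the viewership file
    missingViewership: [...productionNames].filter(name => !viewershipNames.has(name)),
    // in the viewership file, but absent from the production file
    missingCost: [...viewershipNames].filter(name => !productionNames.has(name))
  };
}
```

Calling findUnmatchedShows(cost, viewership) should return two empty arrays when every show appears in both files.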

The answer included plot code, which we can try out as well:
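The plot code from that answer is not reproduced in these notes. As a stand-in, here is a minimal sketch of what a chart of the merged data could look like. It assumes rows shaped like the output of calculateViewsPerDollar and assumes Plot is in scope, as it is by default in Observable Framework. The function name viewsPerDollarChart is hypothetical.

```javascript
// Hypothetical helper (not the original answer's code): draws one bar per
// show from rows that carry show_name and views_per_dollar fields.
// Assumes `Plot` is available in scope, as in an Observable Framework page.
function viewsPerDollarChart(data) {
  return Plot.plot({
    marginLeft: 150, // leave room for long show names on the y axis
    x: { label: "Views per Dollar" },
    marks: [
      Plot.barX(data, {
        y: "show_name",
        x: "views_per_dollar",
        sort: { y: "-x" } // order bars from highest to lowest ratio
      })
    ]
  });
}
```

Because of the implicit display rules described earlier, viewsPerDollarChart(mergedData) would need to be the only expression in its own code block, or be wrapped in display(...).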

Join in the channel function

Instead of making a new dataset for this example, we could also join the datasets inside the channel function and calculate the metric on the fly. In the example below, we look up each show's cost row while iterating over the viewership data, and calculate viewership / cost when determining the x value:

Plot.plot({
  marginLeft: 150,
  color: {
    scheme: "YlOrRd",
    legend: true,
    label: "Viewership",
  },
  x: {
    label: "Views per Dollar"
  },
  marks: [
    Plot.barX(viewership, {
      y: "show_name", 
      x: show => {
        // find the matching cost row for this show
        const thisCost = cost.find(costRow => costRow.show_name === show.show_name);
        // calculate the metric; yields NaN if no cost row matches
        return show["total_estimated_viewership"] / thisCost?.["total_production_cost_usd"];
      },
      fill: "total_estimated_viewership",
      sort: { y: "-x" }
    })
  ]
})
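One thing to be aware of with the channel-function approach is that cost.find scans the cost array once per viewership row. For a handful of shows that hardly matters, but for larger files it can help to build the lookup once. Here is a sketch of that variation, borrowing the Map idea from the generated function above. The helper name makeViewsPerDollar is hypothetical.

```javascript
// Hypothetical helper: builds the cost lookup once and returns an
// accessor function that can be passed as a Plot channel.
function makeViewsPerDollar(costRows) {
  // Index the cost rows by show name for constant-time lookup
  const costByShow = new Map(costRows.map(row => [row.show_name, row]));
  return show => {
    const thisCost = costByShow.get(show.show_name);
    // No matching cost row: return NaN instead of throwing
    if (!thisCost) return NaN;
    return show.total_estimated_viewership / thisCost.total_production_cost_usd;
  };
}
```

In the plot above, the x channel would then become x: makeViewsPerDollar(cost), with the rest of the options unchanged.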