Allow arbitrary logic to run within projections

The new projection feature in v10 is great, but transforming data during projection would be much more convenient if we could apply UDFs or lambdas at arbitrary points in the projection.

Sometimes I need to transform a property. If that property is nested several levels deep, I can’t really use projection to its full potential.

Current:

let stats = data.stats.map(stat => {
  let entry = stat.entry
  let rooms = Rooms.byEntry(entry).map(room => room {
    id,
    name,
    date: Date.fromString(room.dateAsString)
  })
  stat {
    entry: entry {
      id,
      rooms: rooms.paginate()
    }
  }
})

data {
  stats: stats
}

vs. the proposed feature:

data {
  stats {
    entry {
      id,
      rooms: (Rooms.byEntry(.) {
        id,
        name,
        date: Date.fromString(.dateAsString)
      }).paginate() 
    }
  }
}

The current v10 approach requires more code, and it is harder to see what is going on. It gets more confusing with larger data structures and more transformations.

In the proposal above, Rooms.byEntry(.) is an index that gets the set of rooms for an entry. It would receive the current entry because I am passing in the dot (.), which is notation I made up to represent the entire object/document of the current projection context.
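For comparison, here is a rough Python sketch of the behaviour being asked for: a projection spec in which any field may be a function of the current document. The `project` helper, the spec format, `rooms_for`, and the sample data are all hypothetical and made up for illustration; none of this is FQL or a Fauna API.

```python
from datetime import date


def project(doc, spec):
    """Project `doc` through `spec` (hypothetical helper, not a Fauna API).

    `spec` maps a field name to one of:
      True      keep the field as is
      dict      nested projection (applied element-wise to lists)
      callable  arbitrary logic that receives the current document,
                which is the role "." plays in the proposal
    """
    if isinstance(doc, list):
        return [project(item, spec) for item in doc]
    out = {}
    for field, rule in spec.items():
        if rule is True:
            out[field] = doc[field]
        elif callable(rule):
            out[field] = rule(doc)
        else:
            out[field] = project(doc[field], rule)
    return out


# Invented stand-in for the Rooms.byEntry index: rooms keyed by entry id.
ROOMS_BY_ENTRY = {
    "e1": [{"id": "r1", "name": "Lobby", "dateAsString": "2023-05-01", "floor": 0}],
}


def rooms_for(entry):
    # Receives the current entry and projects its rooms,
    # converting the date string along the way.
    room_spec = {
        "id": True,
        "name": True,
        "date": lambda room: date.fromisoformat(room["dateAsString"]),
    }
    return project(ROOMS_BY_ENTRY[entry["id"]], room_spec)


data = {"stats": [{"entry": {"id": "e1", "secret": "x"}}]}

result = project(data, {"stats": {"entry": {"id": True, "rooms": rooms_for}}})
print(result)
```

The point of the sketch is only that the transformation sits at the depth where it is needed, instead of being hoisted into a separate `map` over the top-level array.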

I agree, and you have a great use case that I will make sure to share with the team.

Let me clarify something about the dot notation and offer a different idea, because the dot syntax cannot work this way.

Dot notation

In general, dot notation is shorthand for a lambda. For example, the following are equivalent:

// dot syntax
let getName = .name

// lambda
let getName = _ => _.name

Folks familiar with languages like Scala will recognize this as similar to underscore placeholders. In Scala, you might write the lambda as _.name

This is convenient for providing a lambda as an argument to functions like map.

Collection.all().map(.name)

// equivalent to
Collection.all().map(_ => _.name)

Dot notation has different semantics in a projection.

// alias a projected field
let x = { name: "Paul" }
x {
  first_name: .name
}

But as soon as you use dot notation inside a function, or in another block, it is interpreted the first way I described (as a lambda). This is necessary to disambiguate between the two cases.

In your proposal, how do you tell the difference between picking date from the Room and providing the lambda _ => _.date? You may very well need to pass a lambda to some function and assign the result to a field in your projection.
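To make the ambiguity concrete, here is a small Python analogy (not FQL; the `room` record and the variable names are invented for illustration). The same short form could plausibly mean "the value of the field" or "a function that reads the field", and a projection built from each reading holds a different kind of thing:

```python
from operator import itemgetter

room = {"id": "r1", "date": "2023-05-01"}

# Reading 1: the short form picks the field's value from the current document.
as_value = room["date"]

# Reading 2: the short form is a lambda, like `_ => _.date`,
# which is still waiting for its argument.
as_lambda = itemgetter("date")

projected_value = {"date": as_value}    # holds the date string
projected_lambda = {"date": as_lambda}  # holds a function, not a date

print(type(projected_value["date"]).__name__)
print(callable(projected_lambda["date"]))
```

Without a rule that fixes which reading applies in which position, a parser has no way to choose between the two.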

There is also the case where you want to use a variable from a higher level in a place at a lower level.

Thinking out loud

Could we implicitly declare a variable named after each field so that it can be reused? That might be okay for an object field (entry), but it is more complicated for arrays (rooms).

data {
  stats {
    entry {
      id,
      rooms: Rooms.byEntry(entry).map(room => room {
        id,
        name,
        date: Date.fromString(room.dateAsString)
      })
    }
  }
}

That might work, but would there be issues with conflicting variables?
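One possible conflict, sketched in Python rather than FQL: assume every projected field name is implicitly bound as a variable inside nested blocks (that binding rule is an assumption made for this sketch, not a confirmed design). A nested field that shares a name with an outer field then shadows it. The scopes are modelled here with `collections.ChainMap`, and the sample document is invented:

```python
from collections import ChainMap

stat = {"id": "stat-1", "entry": {"id": "entry-9"}}

# Scope while projecting `stats { ... }`: the stat's fields become variables.
scope_stats = ChainMap({"id": stat["id"], "entry": stat["entry"]})

# Entering `entry { ... }` binds the entry's own fields in a child scope.
scope_entry = scope_stats.new_child({"id": stat["entry"]["id"]})

inner_id = scope_entry["id"]          # resolves to the entry's id
outer_id = scope_entry.parents["id"]  # the stat's id is only reachable via the parent scope
still_visible = scope_entry["entry"]  # names that are not redefined remain visible

print(inner_id, outer_id)
```

Inside the nested block a bare `id` can only mean one of the two, so the outer one would need some explicit way to be addressed.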

If we focus on the array case, could we add for comprehensions/list comprehensions?

      rooms: for r <- Rooms.byEntry(entry) yield {
        id,
        name,
        date: Date.fromString(r.dateAsString)
      }

or Python style:

      rooms: r {
        id,
        name,
        date: Date.fromString(r.dateAsString)
      } for r in Rooms.byEntry(entry)
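For reference, the Python-style version maps almost directly onto a real Python list comprehension. In this sketch `rooms_by_entry` is a hypothetical stand-in for the Rooms.byEntry index, and the sample data is invented:

```python
from datetime import date


def rooms_by_entry(entry):
    # Hypothetical stand-in for the Rooms.byEntry index lookup.
    sample = {
        "e1": [
            {"id": "r1", "name": "Lobby", "dateAsString": "2023-05-01"},
            {"id": "r2", "name": "Annex", "dateAsString": "2023-06-15"},
        ],
    }
    return sample.get(entry["id"], [])


entry = {"id": "e1"}

# `r` is named explicitly, so there is no question about what it refers to,
# and `entry` from the enclosing level stays in scope.
rooms = [
    {"id": r["id"], "name": r["name"], "date": date.fromisoformat(r["dateAsString"])}
    for r in rooms_by_entry(entry)
]

print(rooms)
```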

Thoughts about any of that?

When I chose . in the original example, I was imagining XPath semantics where . refers to the current node, .. is the parent, and maybe some other absolute/relative selection semantics too.
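For anyone unfamiliar with those semantics, Python's standard `xml.etree.ElementTree` module supports a small XPath subset that includes both `.` and `..`. The XML below is made up for illustration and only shows the selection behaviour described here, not anything FQL does:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<stats><entry id='e1'><room id='r1'/><room id='r2'/></entry></stats>"
)

# "." is the context node: here, the element that find() is called on.
entry = doc.find("./entry")

# ".." steps from a node up to its parent: the parent of each <room> is the <entry>.
room_parents = doc.findall(".//room/..")

print(entry.get("id"))
print([p.get("id") for p in room_parents])
```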

Those would be ‘nice to have’ features, but your examples would work for all of my use cases.