Technology | September 24, 2017

Gremlin Recipes: 6 – Projection and Selection

Part 6 of 10 for the series Gremlin Recipes. The purpose is to explain the internals of Gremlin and give people deeper insight into the query language so they can master it.

This blog post is the 6th in the Gremlin Recipes series. It is recommended to read the previous blog posts first:

  1. Gremlin as a Stream
  2. SQL to Gremlin
  3. Recommendation Engine traversal
  4. Recursive traversals
  5. Path object
     

I KillrVideo dataset


To illustrate this series of recipes, you first need to create the schema for KillrVideo and import the data. See here for more details.

The graph schema of this dataset is:

II Projection

By projection we mean SQL projection, i.e. which properties we want to pick for display. For this, Gremlin exposes several operators:

values(properties...)
valueMap(properties...)
project(labels...).by(...)
select(aliases...).by(...)
path().by(...)
group(...).by(...).by(...)
where(...).by(...).by(...)
dedup(...).by(...)

A. values(properties…)

values(properties...) will extract the properties of the vertices/edges for display:

gremlin>g.V().
  has("movie", "title", "Blade Runner"). // Iterator<Movie> with single element
  values("title", "country", "year")     // Iterator<Object> all requested properties
==>Blade Runner
==>1982
==>United States

As you can see, each property is put in the stream and output to the console sequentially. If we had 2 movies in our iterator, we would see:

gremlin>g.V().
  hasLabel("movie").                  // Iterator<Movie>
  limit(2).                           // Iterator<Movie> with 2 movies
  values("title", "country", "year")  // Iterator<Object> all requested properties
==>The Italian Job
==>2003
==>United States
==>Great Expectations
==>1998
==>United States

All the properties of both movies are mixed together. If you want to separate them clearly, use valueMap(properties...).

B. valueMap(properties…)

gremlin>g.V().
  hasLabel("movie").                     // Iterator<Movie> 
  limit(2).                              // Iterator<Movie> with 2 elements
  valueMap("title", "country", "year")   // Iterator<Map<String,Object>>
==>{country=[United States], year=[2003], title=[The Italian Job]}
==>{country=[United States], year=[1998], title=[Great Expectations]}

Now the properties of each movie are nicely isolated and keyed by property name. The only limitation of values(properties...) and valueMap(properties...) is that you can only project on the intrinsic properties of the vertices/edges; no inner traversal is allowed.
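To try the difference between the two steps without a KillrVideo installation, here is a minimal sketch. It assumes the Gremlin Console with TinkerGraph (where TinkerFactory is pre-imported) and uses TinkerPop's bundled "modern" toy graph, so the labels and properties (person, name, age) are not part of the KillrVideo schema. The variable names flat and maps are only illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// values(): every property value becomes its own element of one flat stream
flat = g.V().hasLabel("person").values("name", "age").toList()

// valueMap(): one map per vertex, each value wrapped in a list
maps = g.V().hasLabel("person").valueMap("name", "age").toList()
```

With 4 person vertices, flat holds 8 loose values whereas maps holds 4 maps, one per vertex.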

C. project(labels…).by(…)

With project(labels...).by(...) the projection can be achieved on any axis. Let’s say we want to display a movie title, country, release year and genres. The first 3 properties are the movie’s intrinsic properties; the last one requires a traversal:

gremlin>g.V().
   has("movie", "title", "Blade Runner").         // Iterator<Movie> with a single element
   project("title", "country", "year", "genres"). // Projection
     by("title").                                 //   on "title" property
     by("country").                               //   on "country" property
     by("year").                                  //   on "year" property
     by(out("belongsTo").values("name").fold())   //   on concatenation of movie "genres" 
==>{title=Blade Runner, country=United States, year=1982, genres=[Sci-Fi, Action]}

The project(labels...) step accepts column labels as input. There you can set the labels for your projection columns. It works similarly to SQL aliases (SELECT xxx AS alias).

The by(...) modulator allows you to project on any axis, including complex ones using an inner traversal. by(property_name) is just a shorthand for the more general form by(values(property_name)).

There is an implicit rule when using project(): the number of by() modulators should correspond to the number of labels provided as argument of the preceding project().
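The pairing of labels and by() modulators can be sketched on a small graph. This assumes the Gremlin Console (which pre-imports TinkerFactory and the anonymous-traversal steps such as values() and out()) and TinkerPop's "modern" toy graph instead of KillrVideo; the column label "projects" and the variable row are made up for the illustration.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

row = g.V().has("person", "name", "marko").
  project("name", "age", "projects").          // 3 column labels ...
    by("name").                                // ... 3 by() modulators, in the same order
    by(values("age")).                         // long form of by("age")
    by(out("created").values("name").fold()).  // inner traversal, folded into a list
  next()
```

Here row is a single map with the keys name, age and projects; the last one is a list because of fold().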

D. select(aliases…).by(…)

The select() step can achieve the same projection as project():

gremlin>g.V().
  has("movie", "title", "Blade Runner").     // Iterator<Movie>
  as("blade_runner").                        // "blade_runner" label
  out("belongsTo").                          // Iterator<Genre> 
  as("genres").                              // "genres" label
  select("blade_runner", "genres").          // Projection 
    by(valueMap("title", "country","year")). //  on "title", "country" & "year"   
    by(values("name").fold())                //  on concatenation of genres' name
==>{blade_runner={country=[United States], year=[1982], title=[Blade Runner]}, genres=[Sci-Fi]}
==>{blade_runner={country=[United States], year=[1982], title=[Blade Runner]}, genres=[Action]}

 

We have set the “blade_runner” label on the movie and the “genres” label on its genres, then we applied select(); the by() modulators define the axis for each projection.

Surprisingly, we get 2 results: the “blade_runner” properties are repeated for each distinct genre. This is because out("belongsTo") emits one traverser per genre and select() is evaluated once per traverser: the label “blade_runner” resolves to the same single vertex each time, while “genres” resolves to a different vertex for each of the 2 traversers. The effect is similar to a cartesian product, and thus we end up with 2 results.

If we want only 1 result, we should change our traversal so that “genres” labels all the genres of Blade Runner as a single collection rather than each genre vertex individually:

gremlin>g.V().
  has("movie", "title", "Blade Runner").      // Iterator<Movie>
  as("blade_runner").                         // "blade_runner" label
  out("belongsTo").values("name").fold().     // Iterator<Collection<String>> with a single collection
  as("genres").                               // "genres" label
  select("blade_runner", "genres").           // Projection
    by(valueMap("title", "country","year")).  //  on "title", "country" & "year"
    by()                                      //  on "genres" by value
gremlin>

So we concatenate all the genres into a collection using out("belongsTo").values("name").fold(), label it as “genres” and then project on it with a plain by() because no further transformation is required.

However, we get no result. How come?

This is a well-known caveat when using Gremlin. Whenever you reach a reducing barrier (fold(), sum(), count() …), all the path history leading to this barrier is destroyed. Since all the labels are contained in the path object (see the previous blog post on the path object for more details) and they are wiped out, the select() step can no longer resolve “blade_runner” and returns no result.

To fix the issue, we use a nice trick: we perform the reduction inside a map() step and label the result of the map operation. The inner traversal inside map() hits the reducing barrier but does not impact our outer traversal, so all the path history is preserved:

gremlin>g.V().
  has("movie", "title", "Blade Runner").       // Iterator<Movie>
  as("blade_runner").                          // "blade_runner" label
  map(out("belongsTo").values("name").fold()). // Iterator<Collection<String>> with a single collection
  as("genres").                                // "genres" label
  select("blade_runner", "genres").            // Projection
    by(valueMap("title", "country","year")).   //  on "title", "country" & "year"
    by()                                       //  on "genres" by value
==>{blade_runner={country=[United States], year=[1982], title=[Blade Runner]}, genres=[Sci-Fi, Action]}

And we’re done!
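The effect of the reducing barrier on labels can be reproduced side by side on a small graph. The sketch below assumes the Gremlin Console and TinkerPop's "modern" toy graph rather than KillrVideo; the labels "p" and "projects" and the variables lost and kept are illustrative names only.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// fold() in the main traversal: the path holding label "p" is dropped,
// so select() cannot resolve "p" and nothing comes out
lost = g.V().has("person", "name", "josh").as("p").
  out("created").values("name").fold().as("projects").
  select("p", "projects").by("name").by().
  toList()

// Same reduction wrapped in map(): the outer path and its labels survive
kept = g.V().has("person", "name", "josh").as("p").
  map(out("created").values("name").fold()).as("projects").
  select("p", "projects").by("name").by().
  toList()
```

The first traversal returns an empty list, the second a single map with josh under "p" and the names of his two projects under "projects".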

D. path().by(…)

We have seen in the previous post some uses of the Gremlin path object. In this section we’ll take a closer look at some projection techniques on the path object.

For Blade Runner, we want to display each rating of 8 or more that the movie received, together with the user who gave it:

gremlin>g.V().
  has("movie", "title", "Blade Runner").  // Iterator<Movie>
  inE("rated").                           // Iterator<Rated_Edge>
  filter(values("rating").is(gte(8.0))).  // iterator.filter(rated -> rated.getRating() >= 8.0)
  outV().                                 // Iterator<User>
  path().                                 // Show path projecting on
    by("rating").                         // "rating" property of rated edge
    by("title").                          // "title" property of Movie
    by("userId")                          // "userId" property of User
The property does not exist as the key has no associated value for the provided element: v[{~label=movie, community_id=285248128, member_id=2}]:rating

We got an error: Gremlin cannot find the property “rating” on the Movie vertex. How does Gremlin look up the property for each by() modulator? By following the order of the elements collected in the traversal history (path).

In our path:

  • we first collected Blade Runner Movie vertex
  • then “rated” edge
  • then User vertex


So Gremlin expects to find the property “rating” on the Movie vertex, “title” on the “rated” edge and “userId” on the User vertex, in this order.

To fix the traversal we just need to re-order the by() modulators:

gremlin>g.V().
  has("movie", "title", "Blade Runner").  // Iterator<Movie>
  inE("rated").                           // Iterator<Rated_Edge>
  filter(values("rating").is(gte(8.0))).  // iterator.filter(rated -> rated.getRating() >= 8.0)
  outV().                                 // Iterator<User>
  path().                                 // Show path projecting on
    by("title").                          // "title" property of Movie
    by("rating").                         // "rating" property of rated edge
    by("userId")                          // "userId" property of User
==>[Blade Runner, 8, u751]
==>[Blade Runner, 8, u310]
==>[Blade Runner, 8, u318]
==>[Blade Runner, 8, u672]
==>[Blade Runner, 9, u622]
==>[Blade Runner, 9, u628]
==>[Blade Runner, 8, u651]
==>[Blade Runner, 8, u590]
...

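So far so good. But relying on the position of each by() can be annoying. To make things explicit, we can use the choose(...).option(...) steps to simulate if/else-if/else logic. Since by() modulators are applied round-robin, a single by() is applied to every element of the path: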

gremlin>g.V().
  has("movie", "title", "Blade Runner"). // Iterator<Movie>
  inE("rated").                          // Iterator<Rated_Edge>
  filter(values("rating").is(gte(8.0))). // iterator.filter(rated -> rated.getRating() >= 8.0)
  outV().                                // Iterator<User>
  path().by(                             // Show path projecting on
    choose(label()).                     // If vertex/edge label
    option("movie", values("title")).    //   == "movie" then pick "title" property
    option("user", values("userId")).    //   == "user" then pick "userId" property
    option("rated", values("rating")))   //   == "rated" then pick "rating" property
==>[Blade Runner, 8, u751]
==>[Blade Runner, 8, u310]
==>[Blade Runner, 8, u318]
==>[Blade Runner, 8, u672]
==>[Blade Runner, 9, u622]
==>[Blade Runner, 9, u628]
==>[Blade Runner, 8, u651]
==>[Blade Runner, 8, u590]
...
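Both styles of path projection, positional by() modulators and a single by() driven by the element label, can be compared on a small graph. This is a sketch that assumes the Gremlin Console and TinkerPop's "modern" toy graph (person vertices, "created" edges with a weight, software vertices) instead of KillrVideo; the variable names are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// Path elements: person vertex -> "created" edge -> software vertex.
// The three by() modulators are matched to the elements in that order.
positional = g.V().has("person", "name", "marko").
  outE("created").inV().
  path().by("name").by("weight").by("lang").
  toList()

// A single by() is reused for every path element, so choose() picks
// the property to show from each element's label.
byLabel = g.V().has("person", "name", "marko").
  outE("created").inV().
  path().by(
    choose(label()).
      option("person", values("name")).
      option("created", values("weight")).
      option("software", values("name"))).
  toList()
```

The second form keeps working if steps are added to or removed from the traversal, because it no longer depends on the position of each element in the path.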

E. group(…).by(…).by(…)

Although grouping has been discussed in the 1st blog post, I go into more detail here about the 2 by() modulators.

The 1st by(...) defines the grouping axis, i.e. the property/criterion on which you want to group things. This can be either an intrinsic property of a vertex/edge or even a complete inner traversal.

The 2nd by(...) defines the projection axis, i.e. the property/criterion on which you want to project the grouped vertices/edges. Again this can be either an intrinsic property of a vertex/edge or even a complete inner traversal.

Examples are better than words:

gremlin>g.V().
  has("movie", "title", "Blade Runner"). // Iterator<Movie> 
  inE("rated").                          // Iterator<Rated_Edge>    
    has("rating", is(gte(8.0))).         // iterator.filter(rated -> rated.getRating()>=8.0)
  outV().                                // Iterator<User>
  group().                               // Group
    by("age").                           //   by "age"
    by("userId")                         //   project on "userId"
==>{64=[u610, u408, u452, u476], 14=[u91], 16=[u8], 17=[u136], 19=[u1031, u613, u541], 21=[u689, u1098, u239], 22=[u356, u456], 23=[u273, u663, u678, u390], 24=[u718, u692], 26=[u628, u848, u288], 27=[u445, u685], 29=[u751, u241, u420, u767, u474, u1000], 30=[u423], 31=[u405], 32=[u473], 33=[u548, u421, u472], 34=[u310, u1090, u1011, u262], 35=[u223, u440, u785], 36=[u657, u267], 37=[u434, u398], 38=[u205], 39=[u429, u365, u896], 40=[u693], 41=[u498], 42=[u347, u536], 43=[u389, u287], 44=[u523], 45=[u436, u565], 46=[u335], 47=[u591], 48=[u622], 50=[u672, u264], 51=[u507], 52=[u514], 53=[u318, u277, u328], 54=[u524, u301, u268], 55=[u590], 56=[u477, u569], 57=[u210, u545], 58=[u551], 59=[u651, u202], 60=[u595, u218, u491, u563], 63=[u324, u331, u632, u567]}

Above, we group fans of Blade Runner by their “age” and only display the “userId” (projection on “userId”).

gremlin>g.V().
  has("movie", "title", "Blade Runner"). // Iterator<Movie>
  out("actor").                          // Iterator<Person>  
  group().                               // Group by
    by("name").                          //   actor name
    by(__.in("actor").count())           //    project on the number of movies played by this actor
==>{William Sanderson=1, Harrison Ford=10, Joe Turkel=1, Daryl Hannah=4, Morgan Paull=1, Edward James Olmos=1, Joanna Cassidy=2, M. Emmet Walsh=2, Sean Young=3, Rutger Hauer=3, Hy Pyke=1, Brion James=3, James Hong=2}

 

Above, we group the actors playing in Blade Runner by their name and project on the number of movies they have played in.
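The roles of the two by() modulators can also be checked on a small graph. The sketch assumes the Gremlin Console and TinkerPop's "modern" toy graph, not KillrVideo; the variables namesByLabel and created are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// 1st by(): grouping key (the vertex label); 2nd by(): what to keep per member
namesByLabel = g.V().group().by(label()).by("name").next()

// The 2nd by() can be a reducing inner traversal:
// key on the person name, value = number of "created" neighbours
created = g.V().hasLabel("person").
  group().by("name").by(out("created").count()).
  next()
```

In namesByLabel each key maps to a list of names, while in created each key maps to a single count because the second by() ends with a reducing step.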

F. where(…).by(…).by(…)

The where() clause is usually used to filter the traversal, but did you know it can also be used to look up labeled steps and even project those steps on some axis using an inner traversal?

Let’s see it in action:

gremlin>g.V().
  has("movie", "title", "Blade Runner").      // Iterator<Movie>
  inE("rated").values("rating").fold().       // Iterator<Collection<Int>>
  as("ratings").                              // Label the ratings
  V().                                        // Start a new traversal
  hasLabel("movie").                          // Select all movies
  as("movies").                               // Label them "movies"
  where("movies", gte("ratings")).            // where "movies" >= "ratings" ???? 
    by(inE("rated").values("rating").mean()). //  project "movies" on its avg rating
    by(unfold().mean()).                      //  project "ratings" on its mean      
  limit(10).
  project("title", "avg_rating").
    by("title").
    by(inE("rated").values("rating").mean())
==>{title=American History X, avg_rating=8.297709923664122}
==>{title=The Simpsons (TV Series), avg_rating=8.602564102564102}
==>{title=One, Two, Three, avg_rating=8.3}
==>{title=The Great Escape, avg_rating=8.254237288135593}
==>{title=Bicycle Thieves, avg_rating=8.428571428571429}
==>{title=Citizen Kane, avg_rating=8.235294117647058}
==>{title=Duck Soup, avg_rating=8.257142857142858}
==>{title=Pulp Fiction, avg_rating=8.581005586592179}
==>{title=Unforgiven, avg_rating=8.296296296296296}
==>{title=In the Name of the Father, avg_rating=8.253731343283581}

So some explanation is required since the traversal is rather complex:

inE("rated").values("rating").fold().as("ratings"): we collect all the ratings for Blade Runner into an Iterator<Collection<Int>> and save them as “ratings”.

where("movies", gte("ratings")): this where clause is nonsensical in itself; you don’t compare Movie vertices to an Iterator<Collection<Int>>!

That’s why the by() projections come into play. The by() modulators are applied in order: the first to “movies”, the second to “ratings”.

by(inE("rated").values("rating").mean()) projects “movies” on its average rating.
by(unfold().mean()) projects “ratings” by taking the nested collection of integers and computing their average.
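The same where(label, predicate).by().by() pattern can be tried on a smaller problem: keeping the persons who are older than the average age. This sketch assumes the Gremlin Console (with the P predicates such as gt() pre-imported) and TinkerPop's "modern" toy graph, not KillrVideo; the labels "ages" and "p" and the variable older are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

older = g.V().hasLabel("person").values("age").fold().  // all ages as one collection
  as("ages").
  V().hasLabel("person").as("p").                       // restart from every person
  where("p", gt("ages")).                               // compare "p" with "ages" ...
    by("age").                                          // ... "p" projected on its age
    by(unfold().mean()).                                // ... "ages" projected on its mean
  values("name").
  toList()
```

The ages in the toy graph are 29, 27, 32 and 35, so the mean is 30.75 and only josh and peter pass the filter.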

G. dedup(…).by(…)

This projection after dedup() is rarely used because it’s hard to find a relevant use case. However, here it is:

gremlin>g.V().
  has("movie", "title", "Blade Runner").         // Iterator<Movie>
  as("blade_runner").
  inE("rated").has("rating", gte(8.203)).outV(). // Iterator<User>
  outE("rated").has("rating", gte(8.203)).inV(). // Iterator<Movie>
  where(neq("blade_runner")).                 
  dedup().                                       // Deduplicate the movies
    by("country").                               //  by their country !!
  project("title","country").
    by("title").
    by("country")

The example is somewhat contrived; it is here just to illustrate the usage of dedup(...).by(...). Normally dedup() uses the canonical equals() and hashCode() methods, but you can impose your own projection for deduplication with the by() modulator. In the above traversal we keep just 1 movie for each country.
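A smaller illustration of deduplication on a projection, under the assumption of the Gremlin Console and TinkerPop's "modern" toy graph (not KillrVideo), where both software vertices share the same "lang" value; the variable names are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// Plain dedup(): the two software vertices are distinct, both are kept
all = g.V().hasLabel("software").dedup().values("name").toList()

// dedup().by("lang"): vertices are compared on their "lang" value only,
// so a single vertex survives per language
onePerLang = g.V().hasLabel("software").dedup().by("lang").values("name").toList()

// dedup().by(label()): one vertex per vertex label
onePerLabel = g.V().dedup().by(label()).values("name").toList()
```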

III Selection

By selection, we mean SQL selection, i.e. filtering the vertices/edges to be returned.

For selection Gremlin exposes 3 operators/steps:

  • has(property, predicate)
  • filter(predicate)
  • where(predicate) or where(label, predicate)

One important remark about filtering and selection: they rely heavily on primary keys or indices. If the column/property on which you filter is NEITHER a primary key NOR an indexed property, Gremlin has no other choice than performing a full scan of all possible vertices/edges, and that can just kill your server.

A. has(property, predicate)

This is the most common and useful step for filtering on vertex/edge properties. Let’s say we want to fetch the first 5 movies released in the USA:

gremlin>g.V().
  hasLabel("movie").
  has("country", Search.tokenPrefix("United")).
  limit(5).
  valueMap("title", "country")
==>{country=[United States], title=[The Italian Job]}
==>{country=[United States], title=[Great Expectations]}
==>{country=[United States], title=[The Assassination of Jesse James By The Coward Robert Ford]}
==>{country=[United States], title=[Alien Resurrection]}
==>{country=[United States], title=[A Star Is Born]}

Note the use of the DSE-specific search predicate Search.tokenPrefix("United").

One big limitation of has() is that it only works on the intrinsic properties of a vertex/edge; you cannot filter on inner traversals. For this we need filter(predicate).

B. filter(predicate)

The previous traversal can be rewritten as:

gremlin>g.V().
  hasLabel("movie").
  filter(values("country").is(Search.tokenPrefix("United"))).
  limit(5).
  valueMap("title", "country")
==>{country=[United States], title=[The Italian Job]}
==>{country=[United States], title=[Great Expectations]}
==>{country=[United States], title=[The Assassination of Jesse James By The Coward Robert Ford]}
==>{country=[United States], title=[Alien Resurrection]}
==>{country=[United States], title=[A Star Is Born]}

In fact, any has(property, predicate) step can be rewritten as filter(values(property).is(predicate)). But filter(predicate) can do more than that:

gremlin>g.V().
  hasLabel("movie").                           
  has("country", Search.tokenPrefix("United")).
  filter(out("belongsTo").values("name").is("Sci-Fi")).
  limit(5).
  project("title", "genres").
    by("title").
    by(out("belongsTo").values("name").fold())
==>{title=Alien Resurrection, genres=[Sci-Fi, Horror]}
==>{title=Total Recall, genres=[Sci-Fi, Action]}
==>{title=Battlefield Earth, genres=[Sci-Fi, Action]}
==>{title=Eternal Sunshine of the Spotless Mind, genres=[Romance, Sci-Fi, Comedy, Drama]}
==>{title=Back to the Future. Part III, genres=[Sci-Fi, Western, Adventure, Comedy, Fantasy]}

Using filter(predicate) we restrict the movies released in the USA to those having genre Sci-Fi.
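The equivalence between has() and filter(), and the extra reach of filter(), can be sketched without DSE Search. This assumes the Gremlin Console (P predicates pre-imported) and TinkerPop's "modern" toy graph rather than KillrVideo, with the standard gt() predicate standing in for Search.tokenPrefix(); the variable names are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// has(): test on an intrinsic property of the current vertex
viaHas = g.V().hasLabel("person").has("age", gt(30)).values("name").toList()

// filter(): the same test written as an inner traversal
viaFilter = g.V().hasLabel("person").
  filter(values("age").is(gt(30))).
  values("name").toList()

// filter() can also test neighbours, which has() cannot do
rippleAuthors = g.V().hasLabel("person").
  filter(out("created").values("name").is("ripple")).
  values("name").toList()
```

The first two traversals return the same persons; the third keeps a person only if at least one of their created projects is named ripple.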

C. where(predicate) or where(label, predicate)

We can achieve the same result using where(predicate):

gremlin>g.V().
  hasLabel("movie").                           
  has("country", Search.tokenPrefix("United")).
  where(out("belongsTo").values("name").is("Sci-Fi")).
  limit(5).
  project("title", "genres").
    by("title").
    by(out("belongsTo").values("name").fold())
==>{title=Alien Resurrection, genres=[Sci-Fi, Horror]}
==>{title=Total Recall, genres=[Sci-Fi, Action]}
==>{title=Battlefield Earth, genres=[Sci-Fi, Action]}
==>{title=Eternal Sunshine of the Spotless Mind, genres=[Romance, Sci-Fi, Comedy, Drama]}
==>{title=Back to the Future. Part III, genres=[Sci-Fi, Western, Adventure, Comedy, Fantasy]}

But the real advantage of where() is its ability to look up previously labeled steps in the current traversal. Let’s say we want to find movies whose director also acts in the movie, i.e. the director is also one of the actors:

gremlin>g.V().
  hasLabel("movie").as("movie").         // Iterator<Movie> labeled as "movie"
  out("director").as("director").        // Iterator<User> labeled as "director"
  select("movie").                       // Jump back to "movie" 
  out("actor").as("actors").             // Iterator<User> labeled as "actors"
  where(eq("director")).                 // Actor should be same as "director"
  limit(5).                               
  select("movie", "director", "actors"). // Projection of the saved labels
    by("title").                         // by movie "title" 
    by("name").                          // by director "name"
    by(values("name").fold())            // by actors "name" 
==>{movie=Scary Movie, director=Keenen Ivory Wayans, actors=[Keenen Ivory Wayans]}
==>{movie=Much Ado About Nothing, director=Kenneth Branagh, actors=[Kenneth Branagh]}
==>{movie=Love and Death, director=Woody Allen, actors=[Woody Allen]}
==>{movie=Torrente, el brazo tonto de la ley, director=Santiago Segura, actors=[Santiago Segura]}
==>{movie=The Fly, director=David Cronenberg, actors=[David Cronenberg]}

We have found some of those movies but the actors list is wrong: it contains only one actor, who is also the director. The problem is in the step out("actor").as("actors").where(eq("director")). Indeed, among all the actors of the movie we only retain the one who is also the director, and only that traverser carries the “actors” label. Thus the “actors” label is restricted to the director.

To fix it, on the projection step, we can fetch the complete list of actors by navigating from the movie itself:

gremlin>g.V().
  hasLabel("movie").as("movie").           // Iterator<Movie> labeled as "movie"
  out("director").as("director").          // Iterator<User> labeled as "director"
  select("movie").                         // Jump back to "movie" 
  out("actor").as("actors").               // Iterator<User> labeled as "actors"
  where(eq("director")).                   // Actor should be same as "director"
  limit(5).                               
  select("movie", "director", "movie").    // Projection of the saved labels
    by("title").                           // by movie "title" 
    by("name").                            // by director "name"
    by(out("actor").values("name").fold()) // by actors' "name" navigating from movie
==>{movie=[David L. Lander, Cheri Oteri, Dave Sheridan, Carmen Electra, Anna Faris, Shawn Wayans, Shannon Elizabeth, Andrea Nemeth, Regina Hall, Dan Joffre, Kurt Fuller, Keenen Ivory Wayans, Lochlyn Munro, Jon Abrahams, Trevor Roberts, Marlon Wayans, James Van Der Beek], director=Keenen Ivory Wayans}
==>{movie=[Denzel Washington, Kenneth Branagh, Jimmy Yuill, Alex Lowe, Brian Blessed, Patrick Doyle, Richard Briers, Robert Sean Leonard, Imelda Staunton, Kate Beckinsale, Phyllida Law, Gerard Horan, Keanu Reeves, Emma Thompson, Richard Clifford, Ben Elton, Michael Keaton], director=Kenneth Branagh}
==>{movie=[Howard Vernon, Woody Allen, Alfred Lutter, Aubrey Morris, James Tolkan, Olga Georges-Picot, Harold Gould, Diane Keaton, Jessica Harper], director=Woody Allen}
==>{movie=[Cañita Brava, Tony Leblanc, Jorge Sanz, Jimmy Barnatán, Neus Asensi, Gabino Diego, Nuria Carbonell, Nuria Carbonell, Chus Lampreave, Mariola Fuentes, Andreu Buenafuente, Fernando Trueba, Darío Paso, Santiago Barullo, Santiago Segura, Santiago Urrialde, Carlos Lucas, Carlos Perea, Espartaco Santoni, Máximo Pradera, Poli Díaz, El Gran Wyoming, Daniel Monzón, Julio Sanjuán, Carlos Bardem, Carlos Faemino, Manuel Manquiña, Manuel Tallafé, Antonio de la Torre, Javier Bardem, Javier Cansado, Javier Cámara], director=Santiago Segura}
==>{movie=[Joy Boushel, Jeff Goldblum, Geena Davis, Shawn Hewitt, Leslie Carlson, Michael Copeman, Carol Lazare, John Getz, George Chuvalo, David Cronenberg], director=David Cronenberg}

The result is again nonsensical. We expect to have the title, the director name and the actors’ names, but instead we get the list of actors as “movie” and the director name …

The problem lies in the fact that in the select("movie", "director", "movie") step we use the label “movie” twice! Gremlin pairs the labels with the by() modulators as follows:

  • “movie” → by("title")
  • “director” → by("name")
  • “movie” → by(out("actor").values("name").fold())

So in the end, the last projection for the “movie” label overrides the first one, and that explains the weird output we have. To fix that, we only need to double-label the first movie step:

gremlin>g.V().
  hasLabel("movie").as("movie").as("movie_actors"). // Double labelling
  out("director").as("director").          
  select("movie").                         
  out("actor").as("actors").               
  where(eq("director")).                   
  limit(5).                               
  select("movie", "director", "movie_actors").     // Projection of the saved labels
    by("title").                                   // by movie "title" 
    by("name").                                    // by director "name"
    by(out("actor").values("name").fold())         // by actors' "name" navigating from movie
==>{movie=Scary Movie, director=Keenen Ivory Wayans, movie_actors=[David L. Lander, Cheri Oteri, Dave Sheridan, Carmen Electra, Anna Faris, Shawn Wayans, Shannon Elizabeth, Andrea Nemeth, Regina Hall, Dan Joffre, Kurt Fuller, Keenen Ivory Wayans, Lochlyn Munro, Jon Abrahams, Trevor Roberts, Marlon Wayans, James Van Der Beek]}
==>{movie=Much Ado About Nothing, director=Kenneth Branagh, movie_actors=[Denzel Washington, Kenneth Branagh, Jimmy Yuill, Alex Lowe, Brian Blessed, Patrick Doyle, Richard Briers, Robert Sean Leonard, Imelda Staunton, Kate Beckinsale, Phyllida Law, Gerard Horan, Keanu Reeves, Emma Thompson, Richard Clifford, Ben Elton, Michael Keaton]}
==>{movie=Love and Death, director=Woody Allen, movie_actors=[Howard Vernon, Woody Allen, Alfred Lutter, Aubrey Morris, James Tolkan, Olga Georges-Picot, Harold Gould, Diane Keaton, Jessica Harper]}
==>{movie=Torrente, el brazo tonto de la ley, director=Santiago Segura, movie_actors=[Cañita Brava, Tony Leblanc, Jorge Sanz, Jimmy Barnatán, Neus Asensi, Gabino Diego, Nuria Carbonell, Nuria Carbonell, Chus Lampreave, Mariola Fuentes, Andreu Buenafuente, Fernando Trueba, Darío Paso, Santiago Barullo, Santiago Segura, Santiago Urrialde, Carlos Lucas, Carlos Perea, Espartaco Santoni, Máximo Pradera, Poli Díaz, El Gran Wyoming, Daniel Monzón, Julio Sanjuán, Carlos Bardem, Carlos Faemino, Manuel Manquiña, Manuel Tallafé, Antonio de la Torre, Javier Bardem, Javier Cansado, Javier Cámara]}
==>{movie=The Fly, director=David Cronenberg, movie_actors=[Joy Boushel, Jeff Goldblum, Geena Davis, Shawn Hewitt, Leslie Carlson, Michael Copeman, Carol Lazare, John Getz, George Chuvalo, David Cronenberg]}

Now the result is correct.
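The label lookup performed by where(eq(...)) has a mirror image, where(neq(...)), which is handy to exclude the starting point of a traversal. A minimal sketch, assuming the Gremlin Console (P predicates pre-imported) and TinkerPop's "modern" toy graph rather than KillrVideo; the label "m" and the variable names are illustrative.

```groovy
// Assumption: Gremlin Console with TinkerGraph; "modern" toy graph, not KillrVideo
graph = TinkerFactory.createModern()
g = graph.traversal()

// Everybody who created the same software as marko, marko included
creators = g.V().has("person", "name", "marko").as("m").
  out("created").in("created").
  values("name").toList()

// where(neq("m")) compares the current vertex with the one labeled "m"
// and drops the traverser when they are the same
coCreators = g.V().has("person", "name", "marko").as("m").
  out("created").in("created").
  where(neq("m")).
  values("name").toList()
```

As with where(eq("director")), the comparison is made against the vertex saved under the label, not against a property value.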

And that’s all folks! Do not miss the other Gremlin recipes in this series.

If you have any questions about Gremlin, find me on datastaxacademy.slack.com, channel dse-graph. My id is @doanduyhai.
