Neo4j Graph Data Science Certification Exam: Question Bank 1

Deleting nodes

How can you delete a specific node in Neo4j using Cypher?

Use the MATCH clause to find the node and then DELETE it
Use the DELETE clause with the pattern you want to delete
Use the MERGE clause to delete the node
There is no way to delete a specific node in Neo4j

In Neo4j, you can delete a specific node using the Cypher query language. Here's how you can do it:

Use the MATCH clause to find the node and then DELETE it : This is the correct approach. You first need to match the node you want to delete, and then you can delete it using the DELETE clause.

Here is the syntax for deleting a node:

cypher
MATCH (n:Label {property: 'value'})
DELETE n

Replace Label with the node's label, property with the property key, and 'value' with the value to match. Note that DELETE fails if the node still has relationships; use DETACH DELETE n to remove the node together with its relationships.

To target a node by its internal ID rather than by label and property, use:

cypher
MATCH (n)
WHERE id(n) = $nodeId
DELETE n

Here, $nodeId is a query parameter holding the ID of the node to delete. In Neo4j 5, id() is deprecated in favor of elementId(), which returns a string.

Use the DELETE clause with the pattern you want to delete : This is not sufficient on its own. DELETE takes variables that were bound earlier in the query (typically by MATCH), not a pattern, so it cannot locate the node by itself.

Use the MERGE clause to delete the node : This is incorrect. The MERGE clause is used for creating or matching nodes and relationships, not for deleting them.

There is no way to delete a specific node in Neo4j : This is also incorrect. As shown above, you can delete specific nodes using the MATCH and DELETE clauses in Cypher.

Remember to always back up your data before running delete operations, as they cannot be undone once the transaction has committed.
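
A node that still has relationships cannot be removed with a plain DELETE. The following is a minimal sketch using DETACH DELETE; the Person label, the name property, and the $name parameter are hypothetical and only serve as an illustration.

```cypher
// Hypothetical example: remove one Person node and every relationship attached to it.
MATCH (p:Person {name: $name})
DETACH DELETE p
```

Running the MATCH with RETURN p first is a cheap way to confirm that the pattern selects only the node you intend to delete.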

Relationships in Cypher

In the following MATCH clause, which of the elements represents the relationship between two nodes?

cypher
MATCH (a)-[b]->(c)

(a)
(a)-
-[b]->
(c)
->(c)

In Cypher, a relationship is written as square brackets between two dashes, with an arrowhead indicating its direction. In the given pattern:

cypher
MATCH (a)-[b]->(c)

the element that represents the relationship between the two nodes is -[b]->. Here's what each part represents:

[b]: The square brackets enclose the relationship; b is the variable bound to it.

-: The dashes connect the relationship to the nodes on either side.

->: The arrowhead gives the direction of the relationship, from node a to node c.

So, the correct choice is -[b]->.
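
To see the relationship variable in use, here is a small sketch. The Person and Movie labels, the ACTED_IN type, and the name and title properties are assumptions borrowed from the usual movie dataset, not part of the question.

```cypher
// Hypothetical labels, relationship type, and properties.
// b is bound to each matching relationship; type(b) returns its type as a string.
MATCH (a:Person)-[b:ACTED_IN]->(c:Movie)
RETURN a.name AS actor, type(b) AS relationship, c.title AS movie
```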

Data modeling

Given the following domain question:

What parts of type X are required to make product Y?

Which of the following elements could you represent as nodes in your data model?

Part
Type
Product
X
Y

In the context of data modeling for the domain question "What parts of type X are required to make product Y?", you can represent the following elements as nodes in your Neo4j graph data model:

Part: This could be a node representing individual components or elements that are used in the manufacturing or assembly process.

Type: This could be a node representing the category or classification of parts. For example, 'Type X' could be a specific type of part.

Product: This could be a node representing the final product that is manufactured or assembled, such as 'Product Y'.

In a graph database like Neo4j, these nodes can be connected by relationships to represent the composition of the products. For example, you might have a relationship like PART_OF that connects a Part node to a Product node, indicating that the part is a component of the product. Additionally, you could have a relationship like TYPE_OF connecting a Part node to a Type node, indicating the type of the part.

Here's a simple representation:

Part nodes: Individual parts.

Type nodes: Categories or classes of parts (e.g., Type X).

Product nodes: Final products (e.g., Product Y).

Relationships could be:

PART_OF: From Part to Product.

TYPE_OF: From Part to Type.

This model allows you to query the database to find out which parts of a certain type (e.g., Type X) are required to make a specific product (e.g., Product Y).
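
As a sketch of how the domain question maps onto this model, the query below assumes the PART_OF and TYPE_OF relationships described above and a hypothetical name property on every node.

```cypher
// Assumed model: (:Part)-[:PART_OF]->(:Product) and (:Part)-[:TYPE_OF]->(:Type).
// The name property is hypothetical.
MATCH (:Type {name: 'X'})<-[:TYPE_OF]-(part:Part)-[:PART_OF]->(:Product {name: 'Y'})
RETURN part.name AS requiredPart
```

Here X and Y end up as property values rather than nodes or labels, which is why they are not good node candidates themselves.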

Red flags

What are some things to look for as red flags to investigate in a Cypher query plan?

AllNodesScan
NodeByLabelScan with a lot of DB hits
NodeByLabelScan with a few DB hits
Placement of eager operators
Reading property values early in the query plan.

AllNodesScan: This operator scans all nodes in the database, which can be very expensive, especially for large databases. It usually means the pattern gives the planner no label or index to start from.

NodeByLabelScan with a lot of DB hits: This operator scans nodes with a specific label. If it results in a high number of database hits, it suggests that the query is not using indexes well for that label, leading to a full scan of the nodes with that label.

Placement of eager operators: Eager operators force the query to hold all results in memory before proceeding to the next part of the query. If an eager operator is placed early in the query plan and the preceding operations produce a large number of results, it can lead to high memory usage and potential performance issues.

Reading property values early in the query plan: If property reads are happening early in the query plan and before filters are applied, it might mean that the database is reading more properties than necessary, which can be inefficient. Ideally, filters should be applied as early as possible to reduce the amount of data that needs to be processed.

NodeByLabelScan with a few DB hits: While this might not be a red flag per se, it's important to understand why there are so few hits. If the query is expecting more results, this could indicate a lack of data or an issue with how the query is formulated.

Cartesian product warnings: If the query plan shows a Cartesian product, it means there's a potential for an expensive cross join between two sets of results, which can be very inefficient.

High estimated rows or DB hits: Each operator in the plan reports an estimated row count, and PROFILE adds the actual rows and DB hits. Unusually large values point to the operators doing the most work and are a sign that the query might be slow.

Lack of index usage: If the query plan does not show any index usage, it might indicate that the query could be improved by creating appropriate indexes.

Sort operators with large volumes: Sorting large volumes of data can be resource-intensive. If a sort operation is early in the plan and dealing with a large number of records, it could be a performance bottleneck.

Suboptimal use of filters: If filters are placed after operations that produce large result sets, it might mean that the query is not as efficient as it could be.

When you identify these red flags, you can take steps to optimize the query, such as adding or adjusting indexes, rewriting the query to use filters more effectively, or restructuring the query to reduce the amount of data processed.
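
A minimal sketch of that workflow, assuming a hypothetical Person label with a name property: create an index for the lookup, then use PROFILE to compare the operators and DB hits before and after.

```cypher
// Hypothetical label and property.
// Without this index the plan would typically show NodeByLabelScan followed by a Filter.
CREATE INDEX person_name IF NOT EXISTS FOR (p:Person) ON (p.name);

// PROFILE executes the query and reports rows and DB hits per operator.
PROFILE
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p;
```

EXPLAIN can be used instead of PROFILE to see the plan without executing the query, at the cost of only getting estimates.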

Data modeling

Given the following domain question:

Which customers have purchased Product X?

Which of the following could you represent as relationships in your data model?

Customer
Purchased
Product X

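Of the three elements, only Purchased is naturally a relationship; Customer and Product X describe nodes.

Customer: This would be a node representing individual customers.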

Product: This would be a node representing the products available for purchase.

Purchased : This would be a relationship type that connects a Customer node to a Product node, indicating that the customer has purchased that product.

In this model, each Customer node would have a PURCHASED relationship to a Product node for each product they have bought. If you want to specifically represent "Product X," you could either:

Use a label on the Product node to indicate that it is an instance of Product X, or

Have a property on the Product node that indicates the product's identifier or name, which you can query against.

Here's a simple representation:

Customer nodes: Individual customers.

Product nodes: Individual products, potentially with a label or property indicating they are "Product X."

Purchased relationship: Connects Customer to Product, indicating a purchase.

A Cypher query to find customers who have purchased Product X might look like this:
cypher
MATCH (c:Customer)-[:PURCHASED]->(p:Product {name: 'Product X'})
RETURN c
In this query, :Customer is the label for customer nodes, :PURCHASED is the relationship type, and p:Product {name: 'Product X'} is the product node with a property name that matches 'Product X'.

Schemas in Neo4j

Which of the following statements is correct?

Neo4j does not support schemas
Neo4j allows for an optional schema
Neo4j enforces a strict schema

The correct statement is:

Neo4j allows for an optional schema

Neo4j is a graph database that is highly flexible and does not enforce a strict schema. Instead, it allows for an optional schema, which means that you can define constraints and indexes to enforce certain aspects of your data model if you choose to do so. This flexibility allows you to start with a very flexible schema and add constraints as your application evolves and your data model becomes more defined. It also enables you to take advantage of the performance benefits that come from having a well-defined schema without being locked into a rigid structure from the outset.
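
To illustrate what "optional" means in practice, here is a small sketch that adds a single constraint while leaving everything else unconstrained. The Person label, the properties, and the constraint name are hypothetical, and the FOR ... REQUIRE form assumes a recent Neo4j version (4.4 or later).

```cypher
// Hypothetical label and property; constrain only what needs constraining.
CREATE CONSTRAINT person_name_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// Other properties on Person nodes, and nodes with other labels, stay schema-free.
CREATE (:Person {name: 'Keanu Reeves', born: 1964});
```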

Returning Data

Complete the following Cypher statement to return the name property of the Person node.

cypher
MATCH (:Movie{title: 'The Matrix'})<-[:ACTED_IN]-(p:Person)
RETURN ____ AS name

The missing expression is p.name:

cypher
MATCH (:Movie{title: 'The Matrix'})<-[:ACTED_IN]-(p:Person)
RETURN p.name AS name
This query does the following:

MATCH clause: Finds the pattern where there is a Movie node with the title 'The Matrix' and a Person node that is connected to it via an ACTED_IN relationship.

RETURN clause: Retrieves the name property from the Person node and aliases it as name in the result set.

Multiple Match Clauses

Why would Tom Hanks appear in this Cypher statement?

cypher
MATCH (p:Person)-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(p2)
RETURN p2.name AS coactor

But not in this one?

cypher
MATCH (p:Person)-[:ACTED_IN]->(m), (m)<-[:ACTED_IN]-(p2)
RETURN p2.name AS coactor

It's a bug in Neo4j. You should report it to Neo4j Customer Support
Relationships are unique within the scope of a single MATCH clause only
The import process probably created two Tom Hanks nodes in the database
It would never happen

Tom Hanks shows up as his own co-actor in the first statement but not in the second because of how Cypher applies relationship uniqueness.

In the first statement:

cypher
MATCH (p:Person)-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(p2)
RETURN p2.name AS coactor

The variables p and m bound by the first MATCH carry over to the second, but each MATCH clause is matched on its own. The second MATCH is therefore free to traverse the same ACTED_IN relationship that the first one used, so p2 can be bound to the same node as p. When p is Tom Hanks, Tom Hanks is returned as his own co-actor.

In the second statement:

cypher
MATCH (p:Person)-[:ACTED_IN]->(m),
    (m)<-[:ACTED_IN]-(p2)
RETURN p2.name AS coactor

Both patterns belong to the same MATCH clause. Within a single MATCH, Cypher does not bind the same relationship twice, so the relationship from p to m cannot be reused to reach p2. As long as each person has only one ACTED_IN relationship to a given movie, p2 can never be p, and nobody is listed as their own co-actor.

The correct answer is:

Relationships are unique within the scope of a single MATCH clause only.

This is not a bug in Neo4j; it is how Cypher treats multiple MATCH clauses versus a single MATCH clause with several patterns.
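
If the two-MATCH form is preferred, an explicit inequality filter keeps a person from being returned as their own co-actor. A minimal sketch, assuming the Person and Movie data from the question and a hypothetical name value:

```cypher
// Two MATCH clauses, with an explicit filter so p is never returned as p2.
MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(p2)
WHERE p2 <> p
RETURN DISTINCT p2.name AS coactor
```

DISTINCT is added here because the same co-actor can appear once per shared movie.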

Escaping Special Characters

How can you escape special characters such as spaces in database, user and role names?

With double quotes (").
With single quotes (').
With backticks (`).
With backslashes (\).

In Cypher, the query language for Neo4j, names of databases, users, and roles that contain special characters such as spaces are escaped with backticks (`).

Here's how you would use backticks to escape such a name:

cypher
CREATE USER `user name with space` SET PASSWORD 'changeme'

CREATE USER requires a password clause; the value shown is only a placeholder. Whether a given character is permitted in a name at all still depends on the naming rules of the Neo4j version in use.

Double quotes (") and single quotes (') both delimit string literals in Cypher; they do not escape identifiers such as database, user, or role names.

Backslashes (\) escape special characters inside string literals, but they are not used to escape identifiers.

So, the correct way to escape special characters such as spaces in such names is with backticks (`).
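
As a sketch of backtick quoting in practice: Cypher uses the same quoting for labels, relationship types, and property keys that contain spaces. The label, relationship type, and property names below are hypothetical.

```cypher
// Hypothetical label, relationship type, and property key containing spaces.
MATCH (p:`Movie Star`)-[:`ACTED IN`]->(m:Movie)
RETURN p.`full name` AS name, m.title AS movie
```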

Variables in subqueries

Complete the Cypher statement below to pass the p identifier to the subquery.

cypher
MATCH (p:Person)-[:ACTED_IN]->()-[:IN_GENRE]->(:Genre {name: 'Comedy'})
CALL {
    SET p:FunnyActor
} IN TRANSACTIONS OF 1000 ROWS

In Cypher, a CALL { ... } subquery does not see variables from the outer query automatically. To pass a variable in, the first clause inside the subquery must be an importing WITH that lists it. The $ prefix is not used here; it denotes query parameters, not variables. The completed statement is:

cypher
MATCH (p:Person)-[:ACTED_IN]->()-[:IN_GENRE]->(:Genre {name: 'Comedy'})
CALL {
    WITH p
    SET p:FunnyActor
} IN TRANSACTIONS OF 1000 ROWS

In this query:

MATCH clause: Finds Person nodes connected to a Genre node named 'Comedy' through an ACTED_IN relationship followed by an IN_GENRE relationship.

CALL block: Runs the subquery once for each row produced by the MATCH.

WITH p: Imports the outer variable p into the subquery.

SET p:FunnyActor: Adds the FunnyActor label to the Person node.

IN TRANSACTIONS OF 1000 ROWS: Commits the subquery's changes in batches of 1000 rows, so a large update does not have to fit into a single transaction.

No YIELD or RETURN is needed: YIELD belongs to procedure calls, and a subquery that only writes does not have to return anything.

CALL { ... } IN TRANSACTIONS is available from Neo4j 4.4 and must run in an implicit (auto-commit) transaction, for example by prefixing the query with :auto in Neo4j Browser. On older versions, the apoc.periodic.iterate procedure from the APOC library is a common way to batch updates.
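
For comparison, newer Neo4j releases offer a second way to hand a variable to a subquery: a variable scope clause written directly after CALL. The sketch below assumes Neo4j 5.23 or later; on earlier versions the importing WITH p form is the one to use.

```cypher
// Assumes Neo4j 5.23 or later: p is imported through the scope clause "(p)"
// instead of an importing WITH inside the subquery.
MATCH (p:Person)-[:ACTED_IN]->()-[:IN_GENRE]->(:Genre {name: 'Comedy'})
CALL (p) {
    SET p:FunnyActor
} IN TRANSACTIONS OF 1000 ROWS
```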

Adding additional labels

You have a database consisting of (:Person) nodes connected to (:Movie) nodes through :ACTED_IN and :DIRECTED relationships. What is the benefit of adding additional (:Actor) labels for any (:Person) node with an :ACTED_IN relationship?

It reduces the number of nodes in the graph
It provides a faster way to find a subset of (:Person) nodes
You can remove the :ACTED_IN relationships altogether
There are no benefits

Adding additional labels to nodes in a Neo4j graph database can provide several benefits, especially when it comes to query performance and data modeling. In the context of your question, adding an (:Actor) label to (:Person) nodes that have an :ACTED_IN relationship would offer the following benefit:

It provides a faster way to find a subset of (:Person) nodes.

By adding an (:Actor) label to the (:Person) nodes that have acted in movies, you create a more specific subset of nodes that can be queried directly. This can lead to performance improvements because:

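Label lookup : Neo4j can look up nodes by label, so a scan for (:Actor) visits only the actor nodes instead of every (:Person) node. Indexes are defined on a label plus one or more properties, so you can also create property indexes that cover only (:Actor) nodes.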

Query Simplicity : Queries can be more straightforward and targeted, as you can specifically ask for nodes with the (:Actor) label without having to filter through all (:Person) nodes.

Performance : When you have a large number of (:Person) nodes, some of whom are not actors, having a separate label for actors can reduce the workload for the database engine, as it doesn't have to consider non-actor persons when executing queries related to actors.

The other options you provided are not accurate:

It reduces the number of nodes in the graph: Adding a label does not reduce the number of nodes; it simply adds more information to existing nodes.

You can remove the :ACTED_IN relationships altogether : Removing the :ACTED_IN relationships would lose important data about which actors were involved in which movies, which is not desirable.

There are no benefits: As explained, there are clear benefits to adding additional labels for better query performance and data organization.
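
A minimal sketch of how such a label could be added and then used, assuming the Person, Movie, and ACTED_IN names from the question and a hypothetical name property:

```cypher
// Add the Actor label to every Person with at least one ACTED_IN relationship.
MATCH (p:Person)-[:ACTED_IN]->()
WITH DISTINCT p
SET p:Actor;

// Later queries can start from the smaller Actor subset directly.
MATCH (a:Actor)
RETURN a.name AS actor;
```

The label has to be maintained when new ACTED_IN relationships are created, otherwise it drifts out of sync with the relationships it summarizes.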

Bi-directional relationships

In modeling, the use of bi-directional relationships is a good practice when:

The semantics of the relationship in one direction is different from the other direction.
You want to show the same relationship between two nodes in each direction.
A relationship between two nodes could be represented in either direction.
This is never a good idea.

In graph modeling, the use of bi-directional relationships is a good practice when:

The semantics of the relationship in one direction is different from the other direction.

This means that the relationship has different meanings or implications depending on the direction in which it is traversed. For example, if you have a FOLLOWS relationship between two users, A FOLLOWS B means that user A follows user B, which is not the same as B FOLLOWS A. In such cases, you would model two separate relationships to capture the different semantics.

The other options are not correct for the following reasons:

You want to show the same relationship between two nodes in each direction. This would imply symmetry, which is not typically how relationships are modeled in a graph database unless there is a specific semantic reason for the symmetry.

A relationship between two nodes could be represented in either direction. This suggests that the direction of the relationship does not matter, which is not usually the case. In graph databases, the direction of a relationship is important because it can change the meaning of the relationship.

This is never a good idea. This is too absolute and does not consider the cases where bi-directional relationships are appropriate, such as when the relationship has different meanings in each direction.

In summary, bi-directional relationships are used when the meaning of the relationship changes with its direction.
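
The FOLLOWS case can be sketched as follows. The User label and name property are hypothetical; the point is that each direction is stored as its own relationship because each carries its own meaning.

```cypher
// Hypothetical User nodes: A follows B, and B separately follows A.
MERGE (a:User {name: 'A'})
MERGE (b:User {name: 'B'})
MERGE (a)-[:FOLLOWS]->(b)
MERGE (b)-[:FOLLOWS]->(a);

// Who follows B? The direction in the pattern selects one of the two meanings.
MATCH (follower:User)-[:FOLLOWS]->(:User {name: 'B'})
RETURN follower.name AS follower;
```

For a relationship whose meaning is the same both ways, a single relationship is enough, since a pattern can ignore direction by omitting the arrowhead.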

Intermediate nodes

You have a graph consisting of (:Customer) nodes with :ORDERED relationships to (:Product) nodes. Instead of using a direct relationship between Customer and Product to represent an order, what is the benefit of extracting this relationship into an intermediate node Order? Which of the following optimizations involves enhancing the data model by extracting the 'ordered' relationship into a separate node?

It allows us to only link products to customers.
It enables linking the order to multiple entities, such as products and shipping companies.
It simplifies the graph by reducing the number of nodes.
It prevents the need for any relationships between Customer and Product.

Extracting the :ORDERED relationship into an intermediate node :Order provides several benefits, particularly in terms of data modeling and query flexibility. The optimization that involves enhancing the data model by extracting the 'ordered' relationship into a separate node is:

It enables linking the order to multiple entities, such as products and shipping companies.

By introducing an :Order node, you create a central node that can have relationships with multiple :Product nodes (representing the items ordered) and potentially with other entities like :ShippingCompany (if you want to model shipping information). This approach offers several advantages:

Modularity : The :Order node acts as a central point, making it easier to add or modify related information without altering the relationships between :Customer and :Product nodes directly.

Richness of Data : You can store additional information related to the order itself (like order date, status, total amount, etc.) in the :Order node, which would not be as straightforward with a direct relationship.

Flexibility: It allows for more complex queries that might involve understanding the context of the order, such as finding all orders that are shipped by a particular shipping company or that include a specific product.

The other options are not correct for the following reasons:

It allows us to only link products to customers. This statement is misleading because the introduction of an :Order node doesn't restrict the ability to link products to customers; it just adds an additional layer of detail.

It simplifies the graph by reducing the number of nodes. In fact, introducing an :Order node increases the number of nodes in the graph, but it does so to provide a more structured and informative model.

It prevents the need for any relationships between Customer and Product. This is not necessarily true; in many cases, you might still want to have direct relationships between :Customer and :Product nodes to represent other interactions or interests that are not related to orders.
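
A sketch of what the intermediate node looks like when an order is recorded. All relationship types (PLACED, CONTAINS, SHIPPED_BY), property names, and parameters here are hypothetical choices for illustration, not prescribed by the question.

```cypher
// Hypothetical labels, relationship types, properties, and parameters.
MATCH (c:Customer {id: $customerId})
MATCH (p:Product {sku: $sku})
MATCH (s:ShippingCompany {name: $shipper})
// The Order node carries order-level data and links to several kinds of entities.
CREATE (o:Order {orderedAt: datetime(), status: 'placed'})
CREATE (c)-[:PLACED]->(o)
CREATE (o)-[:CONTAINS {quantity: 1}]->(p)
CREATE (o)-[:SHIPPED_BY]->(s)
RETURN o
```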

Property combination index

What type of constraint can you use to ensure that two or more properties are unique for any given label?

Combination Constraint
Unique Constraint
Node Key Constraint
NodePropertyConst

A Node Key Constraint is a type of constraint that allows you to specify that a combination of property values must be unique across all nodes with a given label. This is useful when you have a label and you want to ensure that no two nodes with that label have the same combination of values for certain properties.

Here's an example of how you might create a Node Key Constraint (Neo4j 5 syntax; node key constraints require Enterprise Edition):

cypher
CREATE CONSTRAINT FOR (n:Label) REQUIRE (n.property1, n.property2) IS NODE KEY;

In this example, every node with the label Label must have both property1 and property2, and the combination of their values must be unique among those nodes.

The other options provided are not correct:

Combination Constraint: This is not a recognized constraint type in Neo4j.

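Unique Constraint: A uniqueness constraint rejects duplicate values but does not require the property to exist. The question is after the Node Key Constraint, which was introduced specifically for combinations of properties and also requires each of them to be present.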

NodePropertyConst: This is not a recognized constraint type in Neo4j. It seems to be a misspelling or a confusion with the "Node Key Constraint."
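
A concrete sketch of a node key, assuming a hypothetical Person label keyed on firstName and lastName, Neo4j 5 syntax, and Enterprise Edition:

```cypher
// Hypothetical label, properties, and constraint name.
CREATE CONSTRAINT person_full_name IF NOT EXISTS
FOR (p:Person) REQUIRE (p.firstName, p.lastName) IS NODE KEY;

// Allowed once; a second Person with the same pair of values, or a Person
// missing either property, would be rejected by the constraint.
CREATE (:Person {firstName: 'Tom', lastName: 'Hanks'});
```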

Indexes

How do you create a RANGE index on a relationship property?

CREATE RANGE INDEX index_name FOR ()-[relType:RELATIONSHIP_TYPE]-() ON (relType.property_key)
CREATE BTREE INDEX index_name FOR (n:NODE_LABEL) ON (n.property_key)
CREATE FULLTEXT INDEX index_name FOR (n:NODE_LABEL) ON (n.property_key)
CREATE TEXT INDEX index_name FOR ()-[relType:RELATIONSHIP_TYPE]-() ON (relType.property_key)

In Neo4j 5, you can create a RANGE index on a relationship property using the following syntax:

cypher
CREATE RANGE INDEX index_name FOR ()-[relType:RELATIONSHIP_TYPE]-() ON (relType.property_key)

The FOR clause uses a relationship pattern, ()-[...]-(), to indicate that the index covers relationships of type RELATIONSHIP_TYPE rather than nodes. A RANGE index supports equality and range predicates on property_key.

The other options do not create a RANGE index on a relationship property:

CREATE BTREE INDEX index_name FOR (n:NODE_LABEL) ON (n.property_key): BTREE was the general-purpose index type before Neo4j 5, where RANGE replaced it. In addition, the pattern targets a node label, not a relationship type.

CREATE FULLTEXT INDEX index_name FOR (n:NODE_LABEL) ON (n.property_key): This refers to a full-text index on a node property, which is used for text search, not for range queries on relationship properties. Full-text indexes are also declared with ON EACH [ ... ] rather than a plain ON clause.

CREATE TEXT INDEX index_name FOR ()-[relType:RELATIONSHIP_TYPE]-() ON (relType.property_key): The relationship pattern is right, but this creates a TEXT index, which serves string predicates such as CONTAINS and ENDS WITH. It is not a RANGE index.
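
A concrete sketch, assuming a hypothetical ACTED_IN relationship type with a numeric year property and Neo4j 5:

```cypher
// Hypothetical relationship type, property, and index name.
CREATE RANGE INDEX acted_in_year IF NOT EXISTS
FOR ()-[r:ACTED_IN]-() ON (r.year);

// A range predicate on the indexed relationship property.
MATCH ()-[r:ACTED_IN]->()
WHERE r.year >= 2000
RETURN count(r) AS rolesSince2000;
```

Prefixing the second statement with EXPLAIN or PROFILE shows whether the planner actually picked the index for the predicate.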