> graph-schema
Graph database schema design and data modeling expert. Use when designing, reviewing, or refactoring graph database schemas (Neo4j, Memgraph, Neptune, etc.). Triggers on graph modeling, node/relationship design, Cypher schema, property graph design, knowledge graph modeling, or when translating a domain into a graph structure. Focuses primarily on data modeling correctness — understanding the user's goal and translating it into the right graph structure — with performance as a secondary concern.
curl "https://skillshub.wtf/pproenca/dot-skills/graph-schema?format=md"
Graph Database Schema Design Best Practices
Comprehensive graph database data modeling guide for property graphs (Neo4j, Memgraph, Amazon Neptune, etc.). Contains 46 rules across 8 categories, prioritized by modeling impact from critical (entity classification, relationship design) to incremental (scale and evolution). Each rule includes detailed explanations, real-world Cypher examples comparing incorrect vs. correct models, and specific impact descriptions.
Philosophy: Data modeling correctness first, performance second. Always ask "what is the user trying to achieve?" before choosing structure.
When to Apply
Reference these guidelines when:
- Designing a new graph database schema from domain requirements
- Translating a relational schema to a graph model
- Deciding whether something should be a node, relationship, or property
- Reviewing an existing graph schema for modeling errors
- Refactoring a graph that produces awkward or slow queries
- Planning for schema evolution and data growth
Rule Categories by Priority
| Priority | Category | Impact | Prefix |
|---|---|---|---|
| 1 | Entity Classification | CRITICAL | entity- |
| 2 | Relationship Design | CRITICAL | rel- |
| 3 | Property Placement | HIGH | prop- |
| 4 | Query-Driven Refinement | HIGH | query- |
| 5 | Structural Patterns | HIGH | pattern- |
| 6 | Anti-Patterns | MEDIUM | anti- |
| 7 | Constraints & Integrity | MEDIUM | constraint- |
| 8 | Scale & Evolution | LOW-MEDIUM | scale- |
Quick Reference
1. Entity Classification (CRITICAL)
- entity-events - Model multi-participant events as first-class nodes
- entity-shared-values - Promote shared property values to nodes
- entity-specific-labels - Use specific labels over generic ones
- entity-multi-label - Qualify entities with multiple labels
- entity-identity-state - Separate identity from mutable state
- entity-reify-actions - Reify lifecycle actions into nodes
- entity-avoid-god-nodes - Avoid kitchen-sink entity nodes
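As a rough illustration of entity-events, the sketch below contrasts a pairwise relationship with an event node. The labels (`Person`, `Meeting`, `Room`), relationship types, and property values are hypothetical and chosen only for this example. They are not taken from the rule files.

```cypher
// Less expressive: a pairwise relationship cannot hold a third
// participant, and the room is trapped in a property.
CREATE (:Person {name: 'Ana'})-[:MET_WITH {date: date('2024-03-01'), room: 'B12'}]->(:Person {name: 'Ben'});

// Event as a first-class node: any number of participants can attach,
// and the room becomes a node that other meetings can share.
CREATE (m:Meeting {id: 'mtg-1', date: date('2024-03-01')})
CREATE (r:Room {name: 'B12'})
CREATE (m)-[:HELD_IN]->(r)
WITH m
UNWIND ['Ana', 'Ben', 'Chloe'] AS name
MERGE (p:Person {name: name})
MERGE (p)-[:ATTENDED]->(m);
```

With the second model, "who attended meetings in room B12?" becomes a plain traversal rather than a property scan over relationships.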
2. Relationship Design (CRITICAL)
- rel-specific-types - Use specific relationship types over generic ones
- rel-meaningful-direction - Choose semantically meaningful direction
- rel-naming-conventions - Follow UPPER_SNAKE_CASE for relationship types
- rel-no-redundant-reverse - Don't create redundant reverse relationships
- rel-properties-scope - Put data on relationships only when it describes the connection
- rel-single-semantic - One relationship type per semantic meaning
- rel-typed-over-filtered - Prefer typed relationships over generic + property filter
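A minimal sketch of rel-typed-over-filtered, assuming a hypothetical `Person`/`Company` domain. The `kind` property and the `WORKS_AT` type are illustrative names, not part of the guide.

```cypher
// Generic type plus property filter: every RELATED_TO relationship
// of the person must be inspected to find the employment ones.
MATCH (p:Person)-[r:RELATED_TO]->(c:Company)
WHERE r.kind = 'employee'
RETURN p.name, c.name;

// Specific type: the meaning lives in the relationship type,
// so the traversal only follows the relationships that matter.
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
RETURN p.name, c.name;
```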
3. Property Placement (HIGH)
- prop-no-foreign-keys - Don't embed foreign keys as properties
- prop-promote-to-node - Promote frequently-queried values to nodes
- prop-correct-data-types - Use appropriate data types for properties
- prop-no-arrays-for-connections - Don't use property arrays when you need relationships
- prop-relationship-vs-node-data - Know when data belongs on relationship vs. node
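One way prop-no-foreign-keys might be applied to existing data is sketched below. It assumes hypothetical `Order` nodes that carry a `customerId` property, and `Customer` nodes keyed by `id`.

```cypher
// Replace an embedded foreign key with a real relationship.
MATCH (o:Order)
WHERE o.customerId IS NOT NULL
MATCH (c:Customer {id: o.customerId})
MERGE (c)-[:PLACED]->(o)
REMOVE o.customerId;
```

Orders whose `customerId` matches no `Customer` are left untouched by this query, so they can be reviewed separately.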
4. Query-Driven Refinement (HIGH)
- query-critical-traversals - Design for your most critical traversals first
- query-shortcut-relationships - Add shortcut relationships for frequent multi-hop queries
- query-denormalize-reads - Denormalize for read-heavy paths
- query-filter-by-rel-props - Use relationship properties to filter traversals
- query-test-before-deploy - Test model against real queries before deploying
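The following sketch shows what query-shortcut-relationships could look like in practice. The `Customer`, `Order`, and `Product` labels and the `PURCHASED` shortcut are assumptions made for illustration.

```cypher
// Frequent two-hop query over the full path.
MATCH (c:Customer {id: 'c-42'})-[:PLACED]->(:Order)-[:CONTAINS]->(p:Product)
RETURN DISTINCT p.sku;

// Materialize a shortcut for that path. MERGE keeps it idempotent,
// so the statement can be re-run as new orders arrive.
MATCH (c:Customer)-[:PLACED]->(:Order)-[:CONTAINS]->(p:Product)
MERGE (c)-[:PURCHASED]->(p);

// The same question answered in one hop.
MATCH (c:Customer {id: 'c-42'})-[:PURCHASED]->(p:Product)
RETURN p.sku;
```

The shortcut is derived data, so it has to be kept in sync with the underlying path whenever orders change.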
5. Structural Patterns (HIGH)
- pattern-intermediary-nodes - Use intermediary nodes for multi-entity relationships
- pattern-hierarchy - Model hierarchies with category nodes and depth relationships
- pattern-linked-list - Use linked lists for ordered sequences
- pattern-timeline-tree - Apply timeline trees for temporal data
- pattern-fan-out - Fan-out pattern for event streams and activity feeds
- pattern-bipartite - Use bipartite structure for many-to-many with context
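As a small sketch of pattern-linked-list, the example below chains hypothetical `Episode` nodes with a `NEXT` relationship and then walks the chain in order. The label, relationship type, and properties are illustrative only.

```cypher
// Build an ordered sequence as a chain of NEXT relationships.
CREATE (:Episode {number: 1})-[:NEXT]->(:Episode {number: 2})-[:NEXT]->(:Episode {number: 3});

// Walk the list from its head. Ordering by path length makes the
// sequence explicit instead of relying on traversal order.
MATCH path = (:Episode {number: 1})-[:NEXT*0..]->(e:Episode)
RETURN e.number
ORDER BY length(path);
```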
6. Anti-Patterns (MEDIUM)
- anti-join-table-nodes - Don't model relational join tables as nodes
- anti-generic-relationships - Don't use generic RELATED_TO or CONNECTED relationships
- anti-relational-porting - Don't port relational schemas directly to graph
- anti-over-modeling - Don't make everything a node
- anti-duplicate-data - Don't duplicate data instead of creating relationships
- anti-string-encoded-structure - Don't encode structured data as delimited strings
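To illustrate anti-string-encoded-structure, the sketch below assumes hypothetical `Article` nodes whose `tags` property holds a semicolon-delimited string, and converts that string into `Tag` nodes.

```cypher
// Anti-pattern: structure hidden inside a delimited string.
CREATE (:Article {id: 'a-1', tags: 'graph;neo4j;modeling'});

// Step 1: turn each encoded value into a Tag node and a relationship.
MATCH (a:Article)
WHERE a.tags IS NOT NULL
UNWIND split(a.tags, ';') AS tagName
MERGE (t:Tag {name: trim(tagName)})
MERGE (a)-[:TAGGED]->(t);

// Step 2: drop the string once the relationships exist.
MATCH (a:Article)
WHERE a.tags IS NOT NULL
REMOVE a.tags;
```

Running the removal as a separate statement keeps the first step easy to verify before the original data is discarded.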
7. Constraints & Integrity (MEDIUM)
- constraint-unique-identifiers - Define uniqueness constraints on natural identifiers
- constraint-existence - Use existence constraints for required properties
- constraint-index-traversals - Create indexes on traversal entry point properties
- constraint-no-over-index - Don't over-index: each index has a write cost
- constraint-node-key - Use composite node keys for natural multi-part identifiers
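The constraint rules above might translate into statements like the following. This sketch uses Neo4j 5 syntax. Memgraph and Neptune use different commands or do not offer all of these features. The labels, properties, and constraint names are hypothetical. In Neo4j, existence and node key constraints require Enterprise Edition.

```cypher
// constraint-unique-identifiers: one node per natural identifier.
CREATE CONSTRAINT person_email_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.email IS UNIQUE;

// constraint-existence: the property must be present (Enterprise Edition).
CREATE CONSTRAINT person_name_exists IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS NOT NULL;

// constraint-node-key: composite identifier (Enterprise Edition).
CREATE CONSTRAINT flight_key IF NOT EXISTS
FOR (f:Flight) REQUIRE (f.number, f.departureDate) IS NODE KEY;

// constraint-index-traversals: index the property used to enter the graph.
CREATE INDEX product_sku IF NOT EXISTS
FOR (p:Product) ON (p.sku);
```

A uniqueness or node key constraint is backed by an index, so a separate index on the same property is unnecessary.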
8. Scale & Evolution (LOW-MEDIUM)
- scale-supernode-mitigation - Mitigate supernodes with fan-out or partitioning
- scale-temporal-versioning - Separate current state from historical state
- scale-schema-migration - Plan for label and relationship type evolution
- scale-batch-refactoring - Use APOC or batched queries for schema refactoring
- scale-dense-node-detection - Monitor and detect emerging supernodes
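A hedged sketch of scale-dense-node-detection and scale-batch-refactoring follows. It assumes Neo4j 5 (for the `COUNT { }` subquery) with the APOC library installed. The degree threshold, the `User` label, and the `tier` property are arbitrary illustrative choices.

```cypher
// Detect candidate supernodes by relationship count.
MATCH (n)
WITH n, COUNT { (n)--() } AS degree
WHERE degree > 10000
RETURN labels(n) AS labels, degree
ORDER BY degree DESC
LIMIT 20;

// Refactor in batches so one huge transaction does not exhaust memory.
CALL apoc.periodic.iterate(
  "MATCH (u:User) WHERE u.tier = 'premium' AND NOT u:PremiumUser RETURN u",
  "SET u:PremiumUser",
  {batchSize: 10000, parallel: false}
);
```

The detection query scans every node, so on a large graph it is better suited to a scheduled check than to interactive use.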
How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions - Category structure and impact levels
- Rule template - Template for adding new rules
Reference Files
| File | Description |
|---|---|
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |