Today we will present an introduction to paragraphs migrations in Drupal. The example consists of migrating paragraphs of one type, then connecting the migrated paragraphs to nodes. A separate image migration is included to demonstrate how they are different. At the end, we will talk about behavior that deletes paragraphs when the host entity is deleted. Let’s get started.
Getting the code
You can get the full code example at https://github.com/dinarcon/ud_migrations. The module to enable is UD paragraphs migration introduction, whose machine name is ud_migrations_paragraph_intro. It comes with three migrations: ud_migrations_paragraph_intro_paragraph, ud_migrations_paragraph_intro_image, and ud_migrations_paragraph_intro_node. One content type, one paragraph type, and four fields will be created when the module is installed.
Note: Configuration placed in a module’s config/install directory will be copied to Drupal’s active configuration. And if those files have a dependencies/enforced/module key, the configuration will be removed when the listed modules are uninstalled. That is how the content type, the paragraph type, and the fields are automatically created and deleted.
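As an illustration, an enforced dependency in one of the module's configuration files could look like the abbreviated sketch below. The file name and the values outside the dependencies key are assumptions made for this example, not copied from the repository.

```yaml
# Hypothetical excerpt of config/install/node.type.ud_paragraphs.yml.
langcode: en
status: true
dependencies:
  enforced:
    module:
      # Uninstalling this module removes this configuration as well.
      - ud_migrations_paragraph_intro
name: 'UD Paragraphs'
type: ud_paragraphs
```

The same dependencies block would be repeated in every configuration file that should be cleaned up together with the module.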
You can get the Paragraphs module using Composer: composer require drupal/paragraphs. This will also download its dependency: the Entity Reference Revisions module. If your Drupal site is not Composer-based, you can get the code for both modules manually.
Understanding the example set up
The example code creates one paragraph type named UD book paragraph (ud_book_paragraph). It has two “Text (plain)” fields: Title (field_ud_book_paragraph_title) and Author (field_ud_book_paragraph_author). A new UD Paragraphs (ud_paragraphs) content type is also created. This has two fields: Image (field_ud_image) and Favorite book (field_ud_favorite_book) containing references to images and book paragraphs imported in separate migrations. The words in parentheses represent the machine names of the different elements.
The paragraph migration
Migrating into a paragraph type is very similar to migrating into a content type. You specify the source, process the fields making any required transformation, and set the destination entity and bundle. The following code snippet shows the source, process, and destination sections:
source:
  plugin: embedded_data
  data_rows:
    - book_id: 'B10'
      book_title: 'The definite guide to Drupal 7'
      book_author: 'Benjamin Melançon et al.'
    - book_id: 'B20'
      book_title: 'Understanding Drupal Views'
      book_author: 'Carlos Dinarte'
    - book_id: 'B30'
      book_title: 'Understanding Drupal Migrations'
      book_author: 'Mauricio Dinarte'
  ids:
    book_id:
      type: string
process:
  field_ud_book_paragraph_title: book_title
  field_ud_book_paragraph_author: book_author
destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: ud_book_paragraph
The most important part of a paragraph migration is setting the destination plugin to entity_reference_revisions:paragraph. This plugin is actually provided by the Entity Reference Revisions module. It is very important to note that paragraph entities are revisioned. This means that when you want to create a reference to them, you need to provide two IDs: target_id and target_revision_id. Regular entity reference fields like files, images, and taxonomy terms only require the target_id. This will be further explained with the node migration.
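To make the difference concrete, the two kinds of reference values can be pictured as follows. This is only an illustrative sketch of the field item structure written as YAML, not migration configuration. The IDs are made up: 3 and 7 match the sample values used later in this article, and 12 is a hypothetical file ID.

```yaml
# Image field: a regular entity reference needs only the entity ID.
field_ud_image:
  - target_id: 12              # hypothetical file ID (fid)

# Paragraph field: an entity reference revisions field needs both
# the entity ID and the revision ID.
field_ud_favorite_book:
  - target_id: 3               # paragraph entity ID
    target_revision_id: 7      # paragraph revision ID
```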
The other configuration that you can optionally set in the destination section is default_bundle. The value will be the machine name of the paragraph type you are migrating into. You can do this when all the paragraphs for a particular migration definition file will be of the same type. If that is not the case, you can leave out the default_bundle configuration and add a mapping for the type entity property in the process section.
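A minimal sketch of that alternative is shown below. It assumes a hypothetical source column named paragraph_type that already holds the machine name of the destination paragraph type. That column is not part of the example module.

```yaml
source:
  plugin: embedded_data
  data_rows:
    - book_id: 'B10'
      # Hypothetical column: the paragraph type for this record.
      paragraph_type: 'ud_book_paragraph'
      book_title: 'The definite guide to Drupal 7'
      book_author: 'Benjamin Melançon et al.'
  ids:
    book_id:
      type: string
process:
  # Map the bundle per record instead of using default_bundle.
  type: paragraph_type
  field_ud_book_paragraph_title: book_title
  field_ud_book_paragraph_author: book_author
destination:
  plugin: 'entity_reference_revisions:paragraph'
```

If the source values do not match the paragraph type machine names, a core plugin such as static_map could translate them before they are assigned to type.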
You can execute the paragraph migration with this command: drush migrate:import ud_migrations_paragraph_intro_paragraph. After running the migration, there is not much you can do to verify that it worked. Contrary to other entities, there is no user interface, available out of the box, that lists all paragraphs in the system. One way to verify if the migration worked is to manually create a View that shows paragraphs. Another way is to query the database directly. You can inspect the tables that store the paragraph fields’ data. In this example, the tables would be:
- paragraph__field_ud_book_paragraph_author for the current author.
- paragraph__field_ud_book_paragraph_title for the current title.
- paragraph_r__8c3a9563ac for all the author revisions.
- paragraph_r__3fa7e9863a for all the title revisions.
Each of those tables contains information about the bundle (paragraph type), the entity id, the revision id, and the migrated field value. Table names are derived from the machine names of the fields. If they are too long, the field name will be hashed to produce a shorter table name. Having to query the database is not ideal. Unfortunately, the options available to check if a paragraph migration worked are limited at the moment.
The node migration
The node migration will serve as the host for both referenced entities: images and paragraphs. The image migration is very similar to the one explained in a previous article. This time, the focus will be the paragraph migration. Both of them are set as dependencies of the node migration, so they need to be executed in advance. The following snippet shows how the source, destination, and dependencies are set:
source:
  plugin: embedded_data
  data_rows:
    - unique_id: 1
      name: 'Michele Metts'
      photo_file: 'P01'
      book_ref: 'B10'
    - unique_id: 2
      name: 'Benjamin Melançon'
      photo_file: 'P02'
      book_ref: 'B20'
    - unique_id: 3
      name: 'Stefan Freudenberg'
      photo_file: 'P03'
      book_ref: 'B30'
  ids:
    unique_id:
      type: integer
destination:
  plugin: 'entity:node'
  default_bundle: ud_paragraphs
migration_dependencies:
  required:
    - ud_migrations_paragraph_intro_image
    - ud_migrations_paragraph_intro_paragraph
  optional: []
Note that photo_file and book_ref both contain the unique identifier of records in the image and paragraph migrations, respectively. These can be used with the migration_lookup plugin to map the reference fields in the nodes to be migrated. ud_paragraphs is the machine name of the target content type.
The mapping of the image reference field follows the same pattern as the one explained in the article on migration dependencies. Using the migration_lookup plugin, you indicate which migration should be searched for the images. You also specify which source column contains the unique identifiers that match those in the image migration. This operation will return a single value: the file ID (fid) of the image. This value can be assigned to the target_id subfield of field_ud_image to establish the relationship. The following code snippet shows how to do it:
field_ud_image/target_id:
  plugin: migration_lookup
  migration: ud_migrations_paragraph_intro_image
  source: photo_file
Paragraph field mappings
Before diving into the paragraph field mapping, let’s think about what needs to be done. Paragraphs are revisioned entities. To make a reference to them, you need two IDs: their entity id and their entity revision id. These two values need to be assigned to two subfields of the paragraph reference field: target_id and target_revision_id respectively. You have to come up with a process pipeline that complies with this requirement. There are many ways to do it, and the specifics will depend on your field configuration. In this example, the paragraph reference field allows an unlimited number of paragraphs to be associated, but only of one type: ud_book_paragraph. Another thing to note is that even though the field allows you to add as many paragraphs as you want, the example migrates exactly one paragraph.
With those considerations in mind, the mapping of the paragraph field will be a two-step process. First, use the migration_lookup plugin to get a reference to the paragraph. Second, use the fetched values to set the paragraph reference subfields. The following code snippet shows how to do it:
pseudo_mbe_book_paragraph:
  plugin: migration_lookup
  migration: ud_migrations_paragraph_intro_paragraph
  source: book_ref
field_ud_favorite_book:
  plugin: sub_process
  source:
    - '@pseudo_mbe_book_paragraph'
  process:
    target_id: '0'
    target_revision_id: '1'
The first step is a normal migration_lookup procedure. The important difference is that instead of getting a single value, like with images, the paragraph lookup operation will return an array of two values. The format is like [3, 7] where the 3 represents the entity id and the 7 represents the entity revision id of the paragraph. Note that the array keys are not named. To access those values, you would use the index of the elements starting with zero (0). This will be important later. The returned array is stored in the pseudo_mbe_book_paragraph pseudofield.
The second step is to set the target_id and target_revision_id subfields. In this example, field_ud_favorite_book is the machine name of the paragraph reference field. Remember that it is configured to accept an arbitrary number of paragraphs, and each will require passing an array of two elements. This means you need to process an array of arrays. To do that, you use the sub_process plugin to iterate over an array of paragraph references. In this example, the structure to iterate over would be like this:
[
  [3, 7]
]
Let’s dissect how to do the mapping of the paragraph reference field. The source configuration of the sub_process plugin contains an array of paragraph references. In the example, that array has a single element: the '@pseudo_mbe_book_paragraph' pseudofield. The quotes (') and at sign (@) are required to reuse an element that appears before in the process pipeline. Then, in the process configuration, you set the subfields for the paragraph reference field. It is worth noting that at this point you are iterating over a list of paragraph references, even if that list contains only one element. If you had more than one paragraph to migrate, whatever you defined in process will apply to all of them.
The process configuration is an array of subfield mappings. The left side of the assignment is the name of the subfield you want to set. The right side of the assignment is an array index of the paragraph reference being processed. Remember that this array does not have named keys, so you use their numerical index to refer to them. The example sets the target_id subfield to the element in the 0 index and the target_revision_id subfield to the element in the 1 index. Using the example data, this would be target_id: 3 and target_revision_id: 7. The quotes around the numerical indexes are important. If not used, the migration will not find the indexes and the paragraphs will not be associated. The end result of this operation will be something like this:
'field_ud_favorite_book' => array (1) [
  array (2) [
    'target_id' => string (1) "3"
    'target_revision_id' => string (1) "7"
  ]
]
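The same pattern should extend to more than one paragraph per node. The sketch below is an assumption-laden variation, not part of the example module: it supposes a hypothetical source column book_refs that holds a list of book identifiers, and a hypothetical pseudofield name. Because the source value is a list, migration_lookup is expected to run once per identifier and produce an array of [id, revision id] pairs, which sub_process can then iterate over directly.

```yaml
source:
  plugin: embedded_data
  data_rows:
    - unique_id: 1
      name: 'Michele Metts'
      # Hypothetical multivalue column, not in the example module.
      book_refs:
        - 'B10'
        - 'B20'
  ids:
    unique_id:
      type: integer
process:
  title: name
  # Expected result: one [entity id, revision id] pair per book.
  pseudo_mbe_book_paragraphs:
    plugin: migration_lookup
    migration: ud_migrations_paragraph_intro_paragraph
    source: book_refs
  field_ud_favorite_book:
    plugin: sub_process
    # Already an array of arrays, so it is not wrapped in a list.
    source: '@pseudo_mbe_book_paragraphs'
    process:
      target_id: '0'
      target_revision_id: '1'
destination:
  plugin: 'entity:node'
  default_bundle: ud_paragraphs
```

If the identifiers arrive as a delimited string instead of a list, a step with the core explode plugin before the lookup would likely be needed.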
There are three ways to run the migrations: manually, executing dependencies, and using tags. The following code snippet shows the three options:
# 1) Manually.
$ drush migrate:import ud_migrations_paragraph_intro_image
$ drush migrate:import ud_migrations_paragraph_intro_paragraph
$ drush migrate:import ud_migrations_paragraph_intro_node
# 2) Executing dependencies.
$ drush migrate:import ud_migrations_paragraph_intro_node --execute-dependencies
# 3) Using tags.
$ drush migrate:import --tag='UD Paragraphs Intro'
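For the third option to work, each migration definition file has to declare the tag. A sketch of how that declaration might look is shown here. The label value is a placeholder assumption, while the tag matches the one used in the command above.

```yaml
id: ud_migrations_paragraph_intro_node
# Hypothetical label, shown only to give the fragment some context.
label: 'UD paragraphs migration introduction: node'
migration_tags:
  # Migrations sharing this tag are imported together by --tag.
  - UD Paragraphs Intro
```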
And that is one way to map paragraph reference fields. In the end, all you have to do is set the target_id and target_revision_id subfields. The process pipeline that gets you to that point can vary depending on how your paragraphs are configured. The following is a non-exhaustive list of things to consider when migrating paragraphs:
- How many paragraph types can be referenced?
- How many paragraph instances are being migrated? Is this a multivalue field?
- Do paragraphs have translations?
- Do paragraphs have revisions?
Do migrated paragraphs disappear upon node rollback?
Paragraph migrations are affected by a particular behavior of revisioned entities. If the host entity is deleted, and the paragraphs do not have translations, the whole paragraph gets deleted. That means that deleting a node will cause the referenced paragraphs’ data to be removed. How does this affect your migration workflow? If the migration of the host entity is rolled back, then the paragraphs will be removed, but the Migrate API will not know about it. In this example, if you run a migrate status command after rolling back the node migration, you will see that the paragraph migration indicates that there are no pending elements to process. The file migration for the images will report the same, but in that case, the images will remain on the system.
In any migration project, it is common to do rollback operations to test new field mappings or fix errors. Thus, chances are very high that you will stumble upon this behavior. Thanks to Damien McKenna for helping me understand this behavior and tracking it to the rollback() method of the EntityReferenceRevisions destination plugin. So, what do you do to recover the deleted paragraphs? You have to roll back both migrations: node and paragraph. And then, you have to import the two again. The following snippet shows how to do it:
# 1) Rollback both migrations.
$ drush migrate:rollback ud_migrations_paragraph_intro_node
$ drush migrate:rollback ud_migrations_paragraph_intro_paragraph
# 2) Import both migrations again.
$ drush migrate:import ud_migrations_paragraph_intro_paragraph
$ drush migrate:import ud_migrations_paragraph_intro_node
What did you learn in today’s blog post? Have you migrated paragraphs before? If so, what challenges have you found? Did you know paragraph reference fields require two subfields to be set? Did you know that deleting the host entity also deletes referenced paragraphs? Please share your answers in the comments. Also, I would be grateful if you shared this blog post with others.
Next: Migrating CSV files into Drupal
This blog post series, cross-posted at UnderstandDrupal.com as well as here on Agaric.coop, is made possible thanks to these generous sponsors: Drupalize.me by Osio Labs has online tutorials about migrations, among other topics, and Agaric provides migration trainings, among other services. Contact Understand Drupal if your organization would like to support this documentation project, whether it is the migration series or other topics.
Comments
2020 June 02
Renaud
Hello Mauricio, paragraphs are everywhere these days so this tutorial is very useful. 2 questions if I may.
First I'm a bit confused by the output of drush migrate:import --tag='UD Paragraphs Intro'. I did not expect to see «3 created» for udm_constants_pseudofields, udm_process_intro and udm_subfields since their respective migrations don't have a migration_tags key. Can you explain?
Second, what if the paragraphs have translations, how would that change the process pipeline?
2020 June 02
Renaud
By the way there is no link to the next tutorial :)
https://agaric.coop/blog/migrating-csv-files-drupal
2020 June 03
Renaud
SOLVED - The issue with the output of drush migrate:import --tag='UD Paragraphs Intro' was due to migrate_run. I was using 8.x-1.0-beta3. The 8.x-1.x-dev version fixes this issue.
2020 June 17
Renaud
I'm facing a challenge here. The module assumes that everyone has a favorite book. But what if someone has no favorite book or has not yet added a favorite book to their profile?
For example the following will generate an error:
- unique_id: 3
  name: 'Stefan Freudenberg'
  photo_file: 'P03'
  book_ref: ''
I've tried modifying the process for field_ud_favorite_book using plugin: skip_on_empty without success. Is this doable?
2020 June 17
Mauricio Dinarte
Hi Renaud,
There are different ways to accommodate an empty reference in the source, and it all depends on the field configuration and the process plugins you use. Below is an example using a plugin provided by the migrate_process_array module:
The key thing is that the sub_process plugin expects an array of arrays as input. If you were to use core's skip_on_empty on an empty array [NULL] it would still pass, and that input is not appropriate for sub_process. The skip_on_empty_array plugin provided by the migrate_process_array module properly accounts for this.
Here is another process pipeline using only core plugins:
In this case we are taking advantage of the fact that this example only imports one paragraph despite the field accepting an unlimited number of values. As such, we can resort to setting the subfields directly for the one item (first delta) of the field_ud_favorite_book field using skip_on_empty and extracting the first (index 0) and second (index 1) elements if the emptiness test passes.
When the process pipeline gets this complicated or hard to understand at first glance, writing a custom process plugin is recommended.
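The core-only approach described above might be sketched as follows. This is a reconstruction under assumptions rather than the pipeline originally posted in this reply: it combines the core skip_on_empty, migration_lookup, and extract plugins, and it has not been checked against the example module.

```yaml
process:
  # Look up the paragraph only when book_ref has a value. For an empty
  # book_ref, skip_on_empty stops this pipeline and the pseudofield is
  # left without a value.
  pseudo_mbe_book_paragraph:
    - plugin: skip_on_empty
      method: process
      source: book_ref
    - plugin: migration_lookup
      migration: ud_migrations_paragraph_intro_paragraph
  # Set the subfields of the first delta (0) directly, skipping them
  # when the lookup above produced nothing.
  field_ud_favorite_book/0/target_id:
    - plugin: skip_on_empty
      method: process
      source: '@pseudo_mbe_book_paragraph'
    - plugin: extract
      index:
        - 0
  field_ud_favorite_book/0/target_revision_id:
    - plugin: skip_on_empty
      method: process
      source: '@pseudo_mbe_book_paragraph'
    - plugin: extract
      index:
        - 1
```

With method: process, only the affected properties are skipped, so the node itself should still be imported without a favorite book.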
2020 June 17
Mauricio Dinarte
Hi Renaud,
I tried to reproduce the unexpected output results, but I was not able to do it. Good that you found a solution already.
About multilingual paragraphs, that is more complex. You would have to migrate the langcode property for the node and paragraph entities and take that into account when mapping the entity reference revisions field. The closest example I can think of in core is the "complete node migration" as documented at https://www.drupal.org/node/3105503. It uses the D7NodeTranslation class to create follow-up migrations for entity reference field translations. This creates translations from d7_entity_reference_translation.yml (for Drupal 7), which makes use of the EntityReferenceTranslationDeriver deriver. The one thing to note is that these are entity reference fields, while paragraphs use entity reference *revisions* fields. So, it would have to be adjusted. In any case, the work on this issue is a great reference for paragraphs migrations in general: https://www.drupal.org/project/paragraphs/issues/2911244
2020 July 05
Renaud
Hello Mauricio,
Thanks for your help, much appreciated. I was able to handle «empty reference» with the suggested plugin: skip_on_empty_array.
Note that this plugin is provided by the migrate_process_skip module (not migrate_process_array).
Cheers!
2021 September 15
Mike C
Hi Mauricio
Thanks for the great article. Is it possible you could post how you do this:
"If that is not the case, you can leave out the default_bundle configuration and add a mapping for the type entity property in the process section."
I guess you put something in the process section of the yml file before the fields? I've been trying to add keys with prepareRow but without default_bundle set, it crashes.
Thanks
Mike