
Agaric is facilitating a full-day training at DrupalCon Seattle to help you understand how to import content into your Drupal 8 website. It's the latest in our series of migration trainings.

This training is open to attendees with intermediate Drupal experience: familiarity with installing a Drupal site and installing modules. We will use the Migrate API and related modules, which allow users to migrate content without writing any code.

With six instructors, we will ensure no one gets left behind and everyone gets the attention they need.

Attendees will learn to:

  • Import data from CSV and JSON files.
  • Transform data to populate taxonomy, date, image, file, and address fields.
  • Get content into Paragraphs.
  • Debug migration errors.

The training price is $450. Space is limited, so we encourage you to register as soon as possible to secure your spot.

As a web development cooperative that champions free software, we're passionate about migrations. They are a way to better understand Drupal's codebase, tap into the power of new features, and build community. We have successfully migrated multiple sites to Drupal 8 and look forward to sharing our experience with you.

This training has sold out, but please get in touch to be the first to know about future opportunities!

In the previous two blog posts, we learned to migrate data from JSON and XML files. We also explained how to configure the migrations to fetch remote files. In today's blog post, we will learn how to add HTTP request headers and authentication to the request. For HTTP authentication, you can choose among three options: Basic, Digest, and OAuth2. To provide this functionality, the Migrate API leverages the Guzzle HTTP Client library. Usage requirements and limitations will be presented. Let's begin.

Example config to add HTTP headers and authentication

Migrate Plus architecture for remote data fetching

The Migrate Plus module provides an extensible architecture for importing remote files. It makes use of different plugin types to fetch files, add HTTP authentication to the request, and parse the response. The following is an overview of the different plugins and how they work together to allow code and configuration reuse.

Source plugin

The url source plugin is at the core of the implementation. Its purpose is to retrieve data from a list of URLs. Ingrained in the system is the goal of separating file fetching from file parsing. The url plugin delegates both tasks to other plugin types provided by Migrate Plus.

Data fetcher plugins

For file fetching, you have two options. The first is file, a general-purpose fetcher for getting files from the local file system or via stream wrappers. This plugin was explained in detail in the posts about JSON and XML migrations. Because it supports stream wrappers, it is very useful for fetching files from different locations and over different protocols. But it has two major downsides. First, it does not allow setting custom HTTP headers or authentication parameters. Second, this fetcher is completely ignored if used with the xml or soap data parsers (see below).
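
For instance, a minimal source definition using this general-purpose fetcher might look like the following sketch. This is for illustration only; the file location is a placeholder:

source:
  plugin: url
  # General-purpose fetcher; works with local paths and stream wrappers.
  data_fetcher_plugin: file
  data_parser_plugin: json
  urls:
    # Placeholder location; any supported stream wrapper would work.
    - public://migrations/udm_example.json
  item_selector: /data/udm_root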

The second fetcher plugin is http. Under the hood, it uses the Guzzle HTTP Client library. This plugin allows you to define a headers configuration. You can set it to a list of HTTP headers to send along with the request. It also allows you to use authentication plugins (see below). The downside is that you cannot use stream wrappers. Only protocols supported by curl can be used: http, https, ftp, ftps, sftp, etc.

Data parser plugins

Data parsers are responsible for processing the files according to their type: JSON, XML, or SOAP. These plugins let you select a subtree within the file hierarchy that contains the elements to be imported. Each record might contain more data than what you need for the migration. So, you make a second selection to manually indicate which elements will be made available to the migration. Migrate Plus provides four data parsers, but only two use the data fetcher plugins. Here is a summary:

  • json can use any of the data fetchers. It offers an extra configuration option called include_raw_data. When set to true, in addition to all the fields manually defined, a new one is attached to the source with the name raw. This contains a copy of the full object currently being processed (see the sketch after this list).
  • simple_xml can use any data fetcher. It uses the SimpleXML class.
  • xml does not use any of the data fetchers. It uses the XMLReader class to directly fetch the file. Therefore, it is not possible to set HTTP headers or authentication.
  • soap does not use any data fetcher. It uses the SoapClient class to directly fetch the file. Therefore, it is not possible to set HTTP headers or authentication.
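
To illustrate the include_raw_data option mentioned in the first item, here is a minimal sketch for the json parser. The URL, selector, and field definitions are placeholders, not a real migration:

source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls:
    - https://example.com/data.json
  item_selector: /data/udm_root
  # Attaches an extra source field named 'raw' containing a copy
  # of the full object currently being processed.
  include_raw_data: true
  fields:
    - name: src_title
      label: 'Title'
      selector: title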

The differences between xml and simple_xml were presented in the previous article.

Authentication plugins

These plugins add authentication headers to the request. With valid credentials, you can fetch data from protected resources. They work exclusively with the http data fetcher. Therefore, you can use them only with the json and simple_xml data parsers. To do so, you set an authentication configuration whose value can be one of the following:

  • basic for HTTP Basic authentication.
  • digest for HTTP Digest authentication.
  • oauth2 for OAuth2 authentication over HTTP.

Below are examples of JSON and XML imports with HTTP headers and authentication configured. The code snippets do not come from real migrations. You can also find them in the ud_migrations_http_headers_authentication directory of the demo repository https://github.com/dinarcon/ud_migrations.

Important: The examples are shown for reference only. Do not store any sensitive data in plain text or commit it to the repository.

JSON and XML Drupal migrations with HTTP request headers and Basic authentication.

source:
  plugin: url
  data_fetcher_plugin: http
  # Choose one data parser.
  data_parser_plugin: json|simple_xml

  urls:
    - https://understanddrupal.com/files/data.json

  item_selector: /data/udm_root

  # This configuration is provided by the http data fetcher plugin.
  # Do not disclose any sensitive information in the headers.
  headers:
    Accept-Encoding: 'gzip, deflate, br'
    Accept-Language: 'en-US,en;q=0.5'
    Custom-Key: 'understand'
    Arbitrary-Header: 'drupal'

  # This configuration is provided by the basic authentication plugin.
  # Credentials should never be saved in plain text nor committed to the repo.
  authentication:
    plugin: basic
    username: totally
    password: insecure

  fields:
    - name: src_unique_id
      label: 'Unique ID'
      selector: unique_id
    - name: src_title
      label: 'Title'
      selector: title
  ids:
    src_unique_id:
      type: integer
process:
  title: src_title
destination:
  plugin: 'entity:node'
  default_bundle: page

JSON and XML Drupal migrations with HTTP request headers and Digest authentication.

source:
  plugin: url
  data_fetcher_plugin: http
  # Choose one data parser.
  data_parser_plugin: json|simple_xml

  urls:
    - https://understanddrupal.com/files/data.json

  item_selector: /data/udm_root

  # This configuration is provided by the http data fetcher plugin.
  # Do not disclose any sensitive information in the headers.
  headers:
    Accept: 'application/json; charset=utf-8'
    Accept-Encoding: 'gzip, deflate, br'
    Accept-Language: 'en-US,en;q=0.5'
    Custom-Key: 'understand'
    Arbitrary-Header: 'drupal'

  # This configuration is provided by the digest authentication plugin.
  # Credentials should never be saved in plain text nor committed to the repo.
  authentication:
    plugin: digest
    username: totally
    password: insecure

  fields:
    - name: src_unique_id
      label: 'Unique ID'
      selector: unique_id
    - name: src_title
      label: 'Title'
      selector: title
  ids:
    src_unique_id:
      type: integer
process:
  title: src_title
destination:
  plugin: 'entity:node'
  default_bundle: page

JSON and XML Drupal migrations with HTTP request headers and OAuth2 authentication.

source:
  plugin: url
  data_fetcher_plugin: http
  # Choose one data parser.
  data_parser_plugin: json|simple_xml

  urls:
    - https://understanddrupal.com/files/data.json

  item_selector: /data/udm_root

  # This configuration is provided by the http data fetcher plugin.
  # Do not disclose any sensitive information in the headers.
  headers:
    Accept: 'application/json; charset=utf-8'
    Accept-Encoding: 'gzip, deflate, br'
    Accept-Language: 'en-US,en;q=0.5'
    Custom-Key: 'understand'
    Arbitrary-Header: 'drupal'

  # This configuration is provided by the oauth2 authentication plugin.
  # Credentials should never be saved in plain text nor committed to the repo.
  authentication:
    plugin: oauth2
    grant_type: client_credentials
    base_uri: https://understanddrupal.com
    token_url: /oauth2/token
    client_id: some_client_id
    client_secret: totally_insecure_secret

  fields:
    - name: src_unique_id
      label: 'Unique ID'
      selector: unique_id
    - name: src_title
      label: 'Title'
      selector: title
  ids:
    src_unique_id:
      type: integer
process:
  title: src_title
destination:
  plugin: 'entity:node'
  default_bundle: page

To use OAuth2 authentication, you need to install the `sainsburys/guzzle-oauth2-plugin` package, as suggested in Migrate Plus’ `composer.json` file. You can do it via Composer by issuing the following command: `composer require sainsburys/guzzle-oauth2-plugin`. Otherwise, you will get an error similar to the following:


[error]  Error: Class 'Sainsburys\Guzzle\Oauth2\GrantType\ClientCredentials'
not found in Drupal\migrate_plus\Plugin\migrate_plus\authentication\OAuth2->getAuthenticationOptions()
(line 46 of /var/www/drupalvm/drupal/web/modules/contrib/migrate_plus/src/Plugin/migrate_plus/authentication/OAuth2.php)
#0 /var/www/drupalvm/drupal/web/modules/contrib/migrate_plus/src/Plugin/migrate_plus/data_fetcher/Http.php(100):
Drupal\migrate_plus\Plugin\migrate_plus\authentication\OAuth2->getAuthenticationOptions()

What did you learn in today’s blog post? Did you know the configuration names for adding HTTP request headers and authentication to your JSON and XML requests? Did you know that this was limited to the parsers that make use of the http fetcher? Please share your answers in the comments. Also, I would be grateful if you shared this blog post with others.

Next: Migrating Google Sheets into Drupal

This blog post series, cross-posted at UnderstandDrupal.com as well as here on Agaric.coop, is made possible thanks to these generous sponsors: Drupalize.me by Osio Labs has online tutorials about migrations, among other topics, and Agaric provides migration trainings, among other services.  Contact Understand Drupal if your organization would like to support this documentation project, whether it is the migration series or other topics.


Mauricio was interviewed on Talking Drupal to discuss his 31 Days of Drupal Migrations tutorial blog series.

CC0
To the extent possible under law, Clayton Dewey has waived all copyright and related or neighboring rights to Build and Manage Online Donations in Drupal with the Give Module. This work is published from: United States.

In other words, please reuse and remix this article as you see fit! No attribution is required; it just needs to stay in the Public Domain.

While working on making a module compatible with Drupal 9, I found that the module was using an obsolete function that had been replaced with a new service. It was something like this:


use Drupal\Core\Access\AccessibleInterface;
use Drupal\search\Plugin\SearchIndexingInterface;
use Drupal\search\Plugin\SearchPluginBase;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * My plugin.
 *
 * @SearchPlugin(
 *   id = "my_plugin",
 *   title = @Translation("My plugin")
 * )
 */
class MyPluginSearch extends SearchPluginBase implements AccessibleInterface, SearchIndexingInterface {

  /**
   * {@inheritdoc}
   */
  public static function create(
    ContainerInterface $container,
    array $configuration,
    $plugin_id,
    $plugin_definition
  ) {
    return new static(
      $configuration,
      $plugin_id,
      $plugin_definition
    );
  }

  /** ... */
  public function indexClear() {
    search_index_clear($this->getPluginId());
  }
}

The function search_index_clear() is now part of the new search.index service that was added in Drupal 8.8. In order to keep this working on Drupal 8.8+ and Drupal 9, we need to inject the service in the create() method. But if we do this unconditionally, we will get an error in Drupal 8.7 because the service does not exist there. What to do then?
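
For reference, here is a minimal before/after sketch of that change; the $plugin_id variable is illustrative:

// Before Drupal 8.8: procedural helper, removed in Drupal 9.
search_index_clear($plugin_id);

// Drupal 8.8 and later: the same operation via the search.index service.
\Drupal::service('search.index')->clear($plugin_id);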

Fortunately, years ago I read an article that addressed a similar need. It talked about how to safely extend Drupal 8 plugin classes without fear of constructor changes. In my case, I didn't want to change the constructor, so as to keep it compatible with Drupal 8.7 and below. At the same time, I wanted to inject the new service to use it in Drupal 8.8+ and Drupal 9. I modified my code a bit, into something like this:

/**
 * My plugin.
 *
 * @SearchPlugin(
 *   id = "my_plugin",
 *   title = @Translation("My plugin")
 * )
 */
class MyPluginSearch extends SearchPluginBase implements AccessibleInterface, SearchIndexingInterface {

  /**
   * The search index service, available in Drupal 8.8 and later.
   *
   * @var \Drupal\search\SearchIndexInterface|null
   */
  protected $searchIndex;

  /**
   * {@inheritdoc}
   */
  public static function create(
    ContainerInterface $container,
    array $configuration,
    $plugin_id,
    $plugin_definition
  ) {
    $instance = new static(
      $configuration,
      $plugin_id,
      $plugin_definition
    );

    // Only inject the service in Drupal 8.8 or newer.
    if (floatval(\Drupal::VERSION) >= 8.8) {
      $instance->searchIndex = $container->get('search.index');
    }

    return $instance;
  }

  /** ... */
  public function indexClear() {
    if (floatval(\Drupal::VERSION) >= 8.8) {
      $this->searchIndex->clear($this->getPluginId());
    }
    else {
      search_index_clear($this->getPluginId());
    }
  }
}

And that's it: Drupal 8.8 and newer will take advantage of the new service while the code stays compatible with Drupal 8.7. This gives users more time to upgrade to Drupal 8.8+ or Drupal 9.
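
As a side note on the version check: floatval(\Drupal::VERSION) is fine for the 8.x series, but PHP's version_compare() is a more general way to compare version strings:

// Alternative check. Unlike floatval(), version_compare() also handles
// multi-digit minor versions such as 9.10 correctly.
if (version_compare(\Drupal::VERSION, '8.8.0', '>=')) {
  $instance->searchIndex = $container->get('search.index');
}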


In the last blog post, we were introduced to managing migrations as configuration entities using Migrate Plus. Today, we will present some benefits and potential drawbacks of this approach. We will also show a recommended workflow for working with migrations as configuration. Let’s get started.

Example workflow for managing migration configuration entities.

What is the benefit of managing migrations as configuration?

At first sight, there does not seem to be a big difference between defining migrations as code or configuration. You can certainly do a lot without using Migrate Plus’ configuration entities. The series so far contains many examples of managing migrations as code. So, what are the benefits of adopting configuration entities?

The configuration management system is one of the major features introduced in Drupal 8. It provides the ability to export all your site’s configuration to files. These files can be added to version control and deployed to different environments. The system has evolved a lot in the last few years, and many workflows and best practices have been established to manage configuration. On top of Drupal core’s incremental improvements, a big ecosystem has sprung up in terms of contributed modules. When you manage migrations via configuration, you can leverage those tools and workflows.

Here are a few use cases of what is possible:

  • When migrations are managed in code, you need file system access to make any changes. Using configuration entities allows site administrators to customize or change the migration via the user interface. This is not about rewriting all the migrations; that should happen during development and never on production environments. But it is possible to tweak certain options. For example, administrators could change the location of the file that is going to be migrated, be it a local file or one on a remote server.
  • When writing migrations, it is very likely that you will work on a subset of the data that will eventually be used to get content into the production environment. Having migrations as configuration allows you to override part of the migration definition per environment (see the sketch after this list). You could use the Configuration Split module to configure different source files or paths per environment. For example, you could link to a small sample of the data in development, a larger sample in staging, and the complete dataset in production.
  • It would be possible to provide extra configuration options via the user interface. In the article about adding HTTP authentication to fetch remote JSON and XML files, the credentials were hardcoded in the migration definition file. That is less than ideal and exposes sensitive information. An alternative would be to provide a configuration form in the administration interface for the credentials to be added. Then, the submitted values could be injected into the configuration for the migration. Again, you could make use of contrib modules like Configuration Split to make sure those credentials are never exported with the rest of your site’s configuration.
  • You could provide a user interface to upload migration source files. In fact, the Migrate source UI module does exactly this. It exposes an administration interface where you have a file field to upload a CSV file. In the same interface, you get a list of supported migrations in the system. This allows a site administrator to manually upload a file to run the migration against. Note: The module is supposed to work with JSON and XML migrations. It did not work during my tests. I opened this issue to follow up on this.
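
As a sketch of the per-environment override mentioned in the second item above, a configuration override in settings.php (or settings.local.php) could point a migration to a different source file. The migration name and URL are placeholders:

// In the development environment's settings.local.php, override the
// source URL of a hypothetical migration configuration entity.
$config['migrate_plus.migration.udm_example']['source']['urls'] = [
  'https://example.com/small-sample.json',
];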

These are some examples, but many more possibilities are available. The point is that you have the whole configuration management ecosystem at your disposal. Do you have another example? Please share it in the comments.

Are there any drawbacks?

Managing migrations as configuration adds an extra layer of abstraction to the migration process. This adds a bit of complexity. For example:

  • Now you have to keep the uuid and id keys in sync (see the sketch after this list). This might not seem like a big issue, but it is something to pay attention to.
  • When you work with migration groups (explained in the next article), your migration definition could live in more than one file.
  • The configuration management system has its own restrictions and workflows that you need to follow, particularly for updates.
  • You need to be extra careful with your YAML syntax, especially if syncing configuration via the user interface. It is possible to import invalid configuration without getting an error. It is not until the migration fails that you realize something is wrong.
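
For reference on the uuid and id pairing from the first item, the top of a migration configuration entity file looks roughly like this sketch; the values are placeholders:

# File: migrate_plus.migration.udm_example.yml
# Drupal assigns the uuid when the configuration is first created.
uuid: 12345678-90ab-cdef-1234-567890abcdef
# The id must match the machine name used in the file name and in
# Drush commands like 'drush migrate:import udm_example'.
id: udm_example
label: 'Example migration'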

Using configuration entities to define migrations certainly offers lots of benefits. But it requires being extra careful managing them.

Workflow for managing migrations as configuration entities

The configuration synchronization system has specific workflows for making changes to configuration entities. This imposes some restrictions on the way you make updates to the migration definitions. Explaining how to manage configuration could use another 31 days blog post series. ;-) For now, only a general overview will be presented. The general approach is similar to managing configuration as code. The main difference is what needs to be done for changes to the migration files to take effect.

You could use the “Configuration synchronization” administration interface at /admin/config/development/configuration. In it, you have the option to export or import a “full archive” containing all your site’s settings, or a “single item” like a specific migration. Exporting a single item is also a handy way to find the UUID a migration was assigned if you did not set one initially. This approach can be followed by site administrators without requiring file system access. Nevertheless, it is error-prone and less than ideal; it is not the recommended way to manage migration configuration entities.

Another option is to use Drush or Drupal Console to synchronize your site’s configuration via the command line. As with the user interface approach, you can export and import your full site configuration or only single elements. The recommendation is to do partial configuration imports so that only the migrations you are actively working on are updated.

Ideally, your site’s architecture is completed before the migration starts. In practice, you often work on the migration while other parts of the site are being built. If you were to export and import the entire site’s configuration as you work on the migrations, you might inadvertently override unrelated pieces of configuration. For instance, this can lead to missing content types, changed field settings, and lots of frustration. That is why partial or single configuration imports are recommended. The following snippet shows a basic workflow for managing migrations as configuration:

# 1) Run the migration.
$ drush migrate:import udm_config_json_source_node_local

# 2) Rollback migration because the expected results were not obtained.
$ drush migrate:rollback udm_config_json_source_node_local

# 3) Change the migration definition file in the "config/install" directory.

# 4a) Sync configuration by folder using Drush.
$ drush config:import --partial --source="modules/custom/ud_migrations/ud_migrations_config_json_source/config/install"

# 4b) Sync configuration by file using Drupal Console.
$ drupal config:import:single --file="modules/custom/ud_migrations/ud_migrations_config_json_source/config/install/migrate_plus.migration.udm_config_json_source_node_local.yml"

# 5) Run the migration again.
$ drush migrate:import udm_config_json_source_node_local

Note the use of the --partial and --source flags in the configuration import command. Also, note that the path is relative to the current working directory from where the command is issued. In this snippet, the value of the source flag is the directory holding your migrations. Be mindful if there are other non-migration-related configurations in the same folder. If you need to be more granular, Drupal Console offers a command to import individual configuration files, as shown in the previous snippet.

Note: Uninstalling and installing the module again will also apply any changes to your configuration. This might produce errors if the migration configuration entities are not removed automatically when the module is uninstalled. Read this article for details on how to do that.
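
One common way to have the configuration entities removed when the module is uninstalled, shown here as a sketch with placeholder names, is to declare an enforced dependency inside each migration configuration file:

# Inside config/install/migrate_plus.migration.udm_example.yml.
# With an enforced module dependency, this configuration entity is
# deleted automatically when the module is uninstalled.
dependencies:
  enforced:
    module:
      - ud_migrations_example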

What did you learn in today’s blog post? Did you know the benefits and trade-offs of managing migrations as configuration? Did you know what to do for changes in migration configuration entities to take effect? Share your answers in the comments. Also, I would be grateful if you shared this blog post with others.

Next: Using migration groups to share configuration among Drupal migrations

This blog post series, cross-posted at UnderstandDrupal.com as well as here on Agaric.coop, is made possible thanks to these generous sponsors. Contact Understand Drupal if your organization would like to support this documentation project, whether it is the migration series or other topics.

It's great to be here, and there, and there.

Thanks, IndieWeb module for Drupal!


Migration trainings

Overview

Learn to move content into Drupal 11 using the Migrate API. We will present an overview of the Extract-Transform-Load (ETL) pattern that Migrate implements. Source, process, and destination plugins will be explained so you learn how each affects the migration process. By the end of the workshop, you will have a better understanding of how the migrate ecosystem works and of the thought process required to plan and perform migrations. All examples will use YAML files to configure migrations. No PHP coding required.

There will be plenty of hands-on examples to demonstrate different migrate concepts and how they can be used to import data into different types of fields. Time will also be allocated to answer attendees’ questions on topics not covered in the predefined material.

Request a Private Training


Wendy is an Industrial Engineer turned web developer. She is an experienced WordPress engineer. Among her projects are online education platforms and multilingual sites. She volunteers at her local WordPress community and has presented at a couple of WordCamps.

Don't hesitate to get help with your migrations!

Agaric is also offering full-day trainings on these topics later this month. Dates, prices, more details, and registration options: