Lullabot Ideas

Quick-and-dirty CCK imports

Article by Jeff Eaton

Recently I needed to import a small (100-200) node set from an 'old' database to a CCK content type. I used the following code, which serves as a nice example of the drupal_execute() function.

<?php
$results = db_query("SELECT * from recipe");
while ($recipe = db_fetch_object($results)) {

  // I was converting old style nodes to CCK nodes.
  // If you're building them from scratch you would
  // want to use: $node = array('type' => 'story'),
  // where 'story' is the type of node you want to
  // create.
  $node = node_load($recipe->nid);

  $values = array();

  // You'll recognize this as the structure of
  // form_values you'll see in a submit or validate
  // handler. You're basically building it manually,
  // rather than using $_POST from a user.
  $values['field_servings'][0]['value'] = $recipe->yield;
  $values['field_prep_time'][0]['key'] = $recipe->preptime;

  $i = 0;
  $ingredients = db_query("SELECT ingredient FROM
    recipe_ingredients WHERE nid = %d ORDER BY weight ASC",
    $recipe->nid);
  while ($ing = db_fetch_object($ingredients)) {
    $values['field_ingredients'][$i++]['value'] = $ing->ingredient;
  }

  // Multivalue CCK fields are handled this way --
  // the name of the field, then a numerical key for
  // each individual field instance, then a 'value'
  // key. Single-value fields are actually handled
  // the same way, but they only have the '0' delta.
  $values['field_directions'][0]['value'] = $recipe->instructions;
  $values['body'] = $recipe->notes;
  $values['field_source'][0]['value'] = $recipe->source;
  $values['status'] = 1;
  //$values['promote'] = 1;

  // 'recipe_node_form' would be 'story_node_form' or
  // whatever node type you're creating.
  drupal_execute('recipe_node_form', $values, $node);
}
?>
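
The field/delta/'value' nesting described in the code comments can be built by a small helper instead of by hand. The following is only a sketch in plain PHP: the function name example_build_cck_values() is made up for illustration, and it assumes every field stores its data under a 'value' key (which is not true of every CCK field; the prep time field above uses 'key').

```php
<?php
/**
 * Build a CCK-style values array from a flat list of field data.
 *
 * Hypothetical helper, not part of Drupal or CCK. $fields maps a field
 * name to either a single value or a list of values; every value ends
 * up under [field name][delta]['value'].
 */
function example_build_cck_values(array $fields) {
  $values = array();
  foreach ($fields as $field_name => $items) {
    // Treat a single value as a one-item list so it gets delta 0.
    if (!is_array($items)) {
      $items = array($items);
    }
    // array_values() renumbers the deltas 0, 1, 2, ...
    foreach (array_values($items) as $delta => $item) {
      $values[$field_name][$delta]['value'] = $item;
    }
  }
  return $values;
}

$values = example_build_cck_values(array(
  'field_servings' => 4,
  'field_ingredients' => array('flour', 'sugar', 'eggs'),
));
```

The resulting array could then be extended with 'body', 'status' and so on before being handed to drupal_execute() as in the article.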

Comments

An alternative method...

If you're on Linux (and for webservers, why use anything else?) you could make a PHP script.

The first two lines of the file should be:

#!/usr/bin/php
<?php

(assuming php is installed in /usr/bin)

Then, once you chmod the file to be executable by you (e.g., chmod 700), you can just run the script as a command. If you put this script in your Drupal install folder, you could do something like this...

<?php
exec("/usr/bin/clear");
include_once('includes/bootstrap.inc');
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
bootstrap_invoke_all('init');
ini_set('memory_limit', '512M');
echo "Bootstrap successful\n\n";

//Authenticate as user 1
user_authenticate('admin', 'password');

/*
* Terms
*   These are hard coded mappings of old Wordpress Category ID's to Drupal Term ID's...
*/
$terms = array(
  => 195,
  17 => 196,
  => 197,
  => 198,
  11 => 199,
  13 => 200,
  16 => 201,
  15 => 202,
  20 => 203,
  21 => 204,
  => 205,
  => 206,
  19 => 207,
  18 => 208,
  => 209,
  14 => 210,
  10 => 211,
  12 => 212,
);


//Get some Wordpress posts...
$result = db_query("SELECT ID, post_date, post_title, post_content, guid from {wp_posts} WHERE post_status='publish'");
while ($post = db_fetch_object($result)) {
  echo t("Importing '%title'... ", array('%title' => $post->post_title));

  //Term Data
  $result2 = db_query("SELECT category_id from {wp_post2cat} WHERE post_id=%d", $post->ID);
  $post_terms = array();
  while ($t = db_fetch_object($result2)) {
    if (isset($terms[$t->category_id])) {
      $post_terms[] = $terms[$t->category_id];
    }
  }

  $node = new StdClass();
  $node->title = $post->post_title;
  $node->type = 'blog';
  $node->body = $post->post_content;
  $node->status = 1;
  $node->format = 1;
  $node->moderate = 0;
  $node->promote = 0;
  $node->sticky = 0;
  $node->revision = 0;
  $node->name = 'whatsnew';
  $node->comment = 0;
  $node->date = $post->post_date;
  $node->taxonomy = $post_terms;

  $node = node_submit($node);
  node_save($node);

  path_set_alias("node/" . $node->nid, substr($post->guid, 26, -1));

  echo " Done\n";
}
?>

The only issue is if you use pathauto - the path_set_alias will add your custom URL as a second URL to the pathauto one. Quite easy to fix - but bloody annoying!! ;-)

I used this recently. Imported 160 blog posts into Drupal in about 3 seconds.

I tried running

I tried running drupal_execute() as a command line script using this method, but form.inc doesn't like it. Seems something's missing, any ideas?

I'm importing data into 30.000+ CCK nodes on a site, and doing it through a browser isn't much fun...

Not having much luck...

Hey, I'm not having much luck with the above code. In particular, it's the taxonomy portion that I can't get going. What version of Drupal is the above code written for? I can't seem to import stuff into the DB with taxonomy data. ....

drupal_execute

I'm trying to understand the purpose for drupal_execute() (and I'm sure there is one), but AFAIK it's simply a way to submit data to Drupal programmatically, simulating an actual form submission?

Why not simply use node_save()?

Looking forward to getting my head around that one...

Full form life cycle

You're right -- it IS just a way to simulate a full form submission without any user input. There are several reasons to do things that way:

  1. Any stuff that was form_alter'd into the node submission form can be properly validated and so on
  2. If errors are encountered (validation, etc) you can check form_get_errors() to see what happened.
  3. drupal_execute() can be used for any form, not JUST node forms.

In general, drupal_execute, because it still goes through node_save() after all the other processing is done, is more robust when you have a bunch of modules installed that muck with the node edit form.

Makes perfect sense

So when one wants to shield data migration from other modules and/or processes, go straight to node_save...otherwise keep it extensible and use drupal_execute.

That will be handy in the future...so far I only had to deal with the first scenario.

Cheers

fair assessment

It's fair to say that if you wanted to block other modules out, using node_save is a way to achieve that.

Don't forget node validation though. That is also bypassed, and might cause errors that are hard to track.

Retrieving node nid

Thanks Jeff for this perfect article.

I have a question: assume I have added content using drupal_execute(). How can I retrieve the nid of the node I have added? I am looking for a standard way. I could run a query that gives me the last nid, but that does not look correct to me, because another instance of the script might have added another node at the same time. It's a rare situation, but possible.

Thanks.

Insert hooks for getting nid

hook_insert should pass in a node with a node ID. I needed to use the insert op under nodeapi (http://api.drupal.org/api/function/hook_nodeapi/5) since the module was not registered for that CCK type (the next step).
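
As a rough sketch of the nodeapi approach: a small module can remember the nid it sees during the 'insert' op, so the import script can read it back right after drupal_execute() returns. The module name "mymodule" and the helper mymodule_last_inserted_nid() are placeholders made up for this example; only the hook_nodeapi() signature comes from the Drupal 5 API.

```php
<?php
/**
 * Remember (or return) the nid of the node most recently inserted
 * during this request. Hypothetical helper; "mymodule" is a placeholder.
 */
function mymodule_last_inserted_nid($nid = NULL) {
  static $last_nid = NULL;
  if ($nid !== NULL) {
    $last_nid = $nid;
  }
  return $last_nid;
}

/**
 * Implementation of hook_nodeapi() (Drupal 5 signature).
 */
function mymodule_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  // By the time the 'insert' op fires, the node has been given its nid.
  if ($op == 'insert') {
    mymodule_last_inserted_nid($node->nid);
  }
}
```

Because the value lives in a static variable for the current request only, a second copy of the script running at the same time should not interfere with it.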

How about importing CCK image fields?

Hi, how would this code look if we wanted to migrate some images alongside the posts?

importing images

Step 1: Make sure that the images are somewhere under the files/ directory

Step 2: Populate the files table using an SQL INSERT query
fid => db_next_id('{files}_fid')
filename => use PHP function basename() for the image filename
filepath => relative path to the image on the Drupal site
filemime => the mimetype, likely to be one of these: 'image/gif', 'image/jpeg', 'image/png'
filesize => use PHP function filesize() on the image file

Step 3: Load in the saved node object, and save the file id

$n = node_load($nid);
$n->field_myimage[0]['fid'] = $fid;
node_save($n);

Step 4: Update the files table with the node id (nid)
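
The column values listed in step 2 can be gathered with plain PHP before running the INSERT. This is only a sketch: example_file_row() is a made-up name, the extension-to-mimetype map covers just the three types mentioned above, and the fid (from db_next_id()) and the INSERT itself are left out.

```php
<?php
/**
 * Collect the files-table column values for one image.
 *
 * Hypothetical helper. $filepath is the path to the image relative to
 * the Drupal root. Returns FALSE for a missing file or unknown type.
 */
function example_file_row($filepath) {
  $mimetypes = array(
    'gif' => 'image/gif',
    'jpg' => 'image/jpeg',
    'jpeg' => 'image/jpeg',
    'png' => 'image/png',
  );
  $extension = strtolower(pathinfo($filepath, PATHINFO_EXTENSION));
  if (!isset($mimetypes[$extension]) || !is_file($filepath)) {
    return FALSE;
  }
  return array(
    'filename' => basename($filepath),
    'filepath' => $filepath,
    'filemime' => $mimetypes[$extension],
    'filesize' => filesize($filepath),
  );
}
```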

Hi aufumy, " Step 4: update

Hi aufumy,

" Step 4: update the files table with the node id (nid) "

What is the significance of step 4? In my case I could not find a nid field in the files table (I am using Drupal 6).

I followed the steps you described apart from the last point. Does this method work for Drupal 6?
One more thing I would like to know: in this process we are just feeding in the $n->field_myimage[0]['fid'] information. What about the $n->field_myimage[0]['data'] field? How will we populate this in the database?

Please help.

- Ipsita

Thanks for the article! is

Thanks for the article! Is there a counterpart to drupal_execute() in Drupal 4.7?

How to create a menu item as well

Referring to the very first piece of code on this page, how can I go about creating a menu item for the node as well? Say I was driving the page type node form...

Exactly what I was looking for.

Albeit not importing from Wordpress, this is pretty much exactly what I've been looking for. I needed to import flexinodes from an old DB into a new DB for a new website. I couldn't figure out how to do this until I found this article. Thanks for the insight!

What about importing CCK Date fields

Hi

Great article. How would you modify this to import CCK date fields?

DJ

Date fields

I have had several people reporting trouble getting this to work for date fields in the Date 5.2 version and after trying to get it to work myself I found out why.

Date 5.2 uses FAPI to send the form values to its validator by adding a #validate item to the form. The date module collects info in widgets that may not look like you would expect them to -- the select widget expects to be getting back an array with 'year', 'month', 'day', etc. The Date Popup widget expects to get back an array of 'date' and 'time'. And if you have a date with timezones, you need to provide the timezone name in the arrays. These values are validated and processed back into string or integer date values in the date validation function. If the validation function doesn't get the right info in the right format, it will fail, but to provide the right info you have to know exactly what is expected by the specific widget and field.
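
To make the two shapes concrete, here is a sketch of helpers that turn a Unix timestamp into the kind of array each widget is described as expecting. Everything beyond the key names 'year', 'month', 'day', 'date' and 'time' is an assumption: the function names are made up, the 'hour' and 'minute' keys are a guess at the "etc.", and the string formats used for 'date' and 'time' depend on how the widget is configured. Timezone handling is left out entirely.

```php
<?php
/**
 * Shape a timestamp the way a select-style date widget might expect it.
 * Hypothetical helper; the 'hour' and 'minute' keys are assumptions.
 */
function example_date_select_value($timestamp) {
  return array(
    'year' => (int) gmdate('Y', $timestamp),
    'month' => (int) gmdate('n', $timestamp),
    'day' => (int) gmdate('j', $timestamp),
    'hour' => (int) gmdate('G', $timestamp),
    'minute' => (int) gmdate('i', $timestamp),
  );
}

/**
 * Shape a timestamp the way a popup-style date widget might expect it.
 * Hypothetical helper; the Y-m-d and H:i formats are assumptions.
 */
function example_date_popup_value($timestamp) {
  return array(
    'date' => gmdate('Y-m-d', $timestamp),
    'time' => gmdate('H:i', $timestamp),
  );
}
```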

On top of that I ran into some sort of quirk in the D5 FAPI processing if you try to create a bunch of nodes at one time from an array of date values (see http://drupal.org/node/258572). When doing that I found that the validation functions don't get called at all after the first node is created, so all the subsequent nodes miss that processing and end up with the wrong values for the field and end up creating nodes with empty values. I suspect it is something about the way that the form is cached or the re-use of the token or something like that that makes the processor think no validation is needed, but I couldn't find a way to get drupal_execute to treat each new node as a completely fresh form. I'm sure there must be a way, I just couldn't find it.

Anyway, I've been telling people not to use drupal_execute to import dates and instead to use node_save(), which seems much simpler to use. I realize the reason for suggesting drupal_execute is so the values will get validated, but I'm not sure that is even reliably true, depending on how you're doing the import and how the validation is done. Using node_save means you need to do your own checking before you import the values that they are correct, but if you're importing values with a script, you should be able to check that the values are appropriate as a step in your processing.

The 5.1 version of the date module does not use #validate, so it doesn't have a validator choking on the wrong values. And the Date module may be the only D5 CCK field doing its own validation that way, so this may not be a problem in other fields.
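
If you go the node_save() route, the "do your own checking" step can be a small function run on each source row before the node is built. The sketch below uses the recipe columns from the article; the function name and the two rules it enforces are invented for illustration and would need to match whatever your own fields require.

```php
<?php
/**
 * Check one source row before building a node from it.
 *
 * Hypothetical helper with made-up rules. Returns a list of problems;
 * an empty list means the row looks safe to import.
 */
function example_check_recipe_row($recipe) {
  $errors = array();
  if (!isset($recipe->instructions) || trim($recipe->instructions) === '') {
    $errors[] = 'Missing instructions';
  }
  if (isset($recipe->yield) && !is_numeric($recipe->yield)) {
    $errors[] = 'Servings (yield) is not a number';
  }
  return $errors;
}
```

Rows that come back with errors can be logged and skipped rather than saved as broken nodes.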

Just ran into the problem KarenS describes with Date fields

I'm using Drupal 6 with the Date 6.x-2.0-rc6 release, and indeed, the second node I try to create throws an error in db_escape_string() because it's getting an array instead of the string from the data field. After that, all bets are off, and the nodes it creates are a mess. :-p

So whatever the bug or problem is, it's still around.

D6

Thanks Jeff for all the good articles.
For those using drupal 6:
You have to consider some extra things:

Since you are passing $form_state in drupal_execute you have to add your fields to
$values['value']. So Jeff's example becomes:

$values['value']['field_servings'][0]['value'] = $recipe->yield;
$values['value']['field_prep_time'][0]['key'] = $recipe->preptime;
$values['value']['op'] = t('Save'); // is mandatory, otherwise the node won't be saved

Other changes, according to api.drupal.org:

module_load_include('inc', 'node', 'node.pages');
drupal_execute('your-content-type_node_form', $form_state, (object)$node);

for more info please search the drupal forums.

D6: $values['values']

There is a typo in the code above. It should be $values['values']:

<?php
$values
['values']['field_servings'][0]['value'] = $recipe->yield;
?>
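
Putting the two D6 comments together, the array passed to drupal_execute() needs the submitted fields under a 'values' key, plus an 'op' entry. A minimal sketch in plain PHP follows; example_build_form_state() is a made-up name, and inside Drupal you would pass t('Save') as the label rather than the bare string used here.

```php
<?php
/**
 * Wrap field values in a Drupal 6 style $form_state array.
 *
 * Hypothetical helper. $op_label should match the text of the form's
 * save button; in Drupal that would normally be t('Save').
 */
function example_build_form_state(array $field_values, $op_label = 'Save') {
  $form_state = array('values' => $field_values);
  $form_state['values']['op'] = $op_label;
  return $form_state;
}

$form_state = example_build_form_state(array(
  'title' => 'Pancakes',
  'field_servings' => array(0 => array('value' => 4)),
));
```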

Not working on Drupal 6 :-(

Hello!

I'm trying to import on Drupal 6 a content type guestbook (German: gaestebuch) with some CCK fields like described above.

Unfortunately I just get the following error message:

Fatal error: Unsupported operand types in /var/www/xxxxx/html/includes/form.inc on line 511

I tried it also before in a different way with node_save, but this didn't save my CCK fields, just standard node fields.

All help is appreciated!

Below the script is attached.

Thanks in advance and best regards,

Chris

<?php
$execute_script = true;
$debuglevel = 4; // 0 = no debug; 1 = error; 2 = warning; 3 = info; 4 = debug
$adminuser = 'admin';
$adminpass = 'XXXXXXXXXXXXXXXXXX';
$contenttype = 'gaestebuch_test';
$nodeform = 'gaestebuch_test_node_form';
$nodepromoted = 0;
$nodesticky = 0;

function mymain() {
  global $execute_script;
  global $debuglevel;
  global $adminuser;
  global $adminpass;
  global $contenttype;
  global $nodepromoted;
  global $nodesticky;

  if ($execute_script) {

    // initiate script
    init_script();
    html_head();
    echo "Bootstrap successful<br><br>";
    print("debuglevel: ".$debuglevel."<br><br>");

    // Test creation of node
    $nodetitle = 'Test 02';
    $nodebody = 'Test 02 script.';
    $nodeuserid = 0;
    $nodecreated = time();
    $gaestebuch_name = 'Chris';
    $gaestebuch_email = 'mail@domain.dom';
    $gaestebuch_homepage = 'http://www.domain.dom/';

    create_mynode($nodetitle, $nodebody, $nodeuserid, $nodecreated, $gaestebuch_name, $gaestebuch_email, $gaestebuch_homepage);

    html_foot();
  }

}

function create_mynode($nodetitle, $nodebody, $nodeuserid, $nodecreated, $gaestebuch_name, $gaestebuch_email, $gaestebuch_homepage) {
  global $nodeform;
  global $execute_script;
  global $debuglevel;
  global $adminuser;
  global $adminpass;
  global $contenttype;
  global $nodepromoted;
  global $nodesticky;

  //DEBUG Level Debug
  if ($debuglevel >= 4) {
    print("Parameters: nodetitle: ".$nodetitle." nodebody: ".$nodebody." nodeuserid: ".$nodeuserid." nodecreated: ".$nodecreated." gaestebuch_name: ". $gaestebuch_name." gaestebuch_email: ".$gaestebuch_email." gaestebuch_homepage: ".$gaestebuch_homepage."<br>");
  }

  // new node
  $node = new StdClass();

  $values = array();
  module_load_include('inc', 'node', 'node.pages');
  $node = array('type' => $contenttype);    //Can be any content type you have
  $values['values']['is_new'] = TRUE; //If this is a new entry, add this.  Otherwise replace it with $node->nid
  $values['values']['status'] = 1; //Optional if this is an existing node
  $values['values']['name'] = 'Gast'; //$nodeuserid;  //Optional if this is an existing node.
  $values['values']['created'] = $nodecreated; //** Valid unix time stamp
  $values['values']['changed'] = $nodecreated; //** Valid unix time stamp
  $values['values']['promote'] = $nodepromoted; // promote to frontpage
  $values['values']['sticky'] = $nodesticky;
  $values['values']['language'] = 'de';
  $values['values']['op'] = t('Save'); // is mandatory, otherwise the node won't be saved
  $values['values']['title'] = $nodetitle;
  $values['values']['teaser'] = $nodebody;
  $values['values']['body'] = $nodebody;

  // CCK Fields

  // field_gaestebuch_name
  $values['values']['field_gaestebuch_name'][0]['value'] = $gaestebuch_name; // CCK field for Name
  // field_gaestebuch_email
  $values['values']['field_gaestebuch_email'][0]['value'] = $gaestebuch_email; // CCK field for email
  // field_gaestebuch_homepage
  //  field_gaestebuch_homepage[0][title]
  //  field_gaestebuch_homepage[0][url]
  $values['values']['field_gaestebuch_homepage'][0]['title'] = $gaestebuch_homepage; // CCK field for Homepage Title
  $values['values']['field_gaestebuch_homepage'][0]['url'] = $gaestebuch_homepage; // CCK field for Homepage URL
  $values['values']['field_gaestebuch_homepage'][0]['attributes'] = 'n';

  //DEBUG Level Info
  if ($debuglevel >= 3) {
    print("<br>Creating new node with Title: ".$values['values']['title']." from user: ".$values['values']['uid']." Creation Date: ". unixtime2date($values['values']['created'])."<br>");
  }

  // create the node
  drupal_execute($nodeform, $values, (object)$node);

  //DEBUG Level Info
  if ($debuglevel >= 3) {
    print("... done: new Node ID: ".$node->nid."<br>");
  }

}

function init_script() {
  global $adminuser;
  global $adminpass;

  include_once('includes/bootstrap.inc');
  drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
  bootstrap_invoke_all('init');
  ini_set('memory_limit', '512M');

  require_once 'modules/node/node.pages.inc';

  //Authenticate as user 1
  user_authenticate($adminuser, $adminpass);
}

function unixtime2date($unixtimestamp) {
  return date('Y-m-d H:i', $unixtimestamp);
}

function html_head() {
  global $contenttype;
  print("<html><head><title>Import ".$contenttype."</title></head><body><h1>Import ".$contenttype."</h1>");

}

function html_foot() {
  print("</body></html>");
}
mymain();
?>

got it working with node_save

Hello!

I got my script working.
The trick was to call three functions:

<?php
$node = node_submit($node);
node_save($node);  //Actually save or edit the node now
content_insert($node);
?>

The complete function:

<?php
function create_guestbook_node($nodetitle, $nodebody, $nodeuserid, $nodeusername, $nodecreated, $gaestebuch_name, $gaestebuch_email, $gaestebuch_homepage) {

  $cfdebuglevel = 3; // 0 = no debug; 1 = error; 2 = warning; 3 = info; 4 = debug
  $contenttype = 'gaestebuch_test';
  $nodepromoted = 0;
  $nodesticky = 0;

  output("Contenttype: ".$contenttype);
  output("debuglevel: ".$cfdebuglevel);

  //user_authenticate($adminuser, $adminpass);

  // Construct the new node object.
  $node = new stdClass();

  $node->is_new = TRUE; //If this is a new entry, add this.  Otherwise replace it with $node->nid
  $node->type = $contenttype; //Can be any content type you have
  node_object_prepare($node); // just filled in default values for uid, status, promote, status, date, created, and revision properties
  // Your script will probably pull this information from a database.
  $node->status = 1; //Optional if this is an existing node
  $node->uid = $nodeuserid; //Optional if this is an existing node.
  $node->name = $nodeusername;
  $node->created = $nodecreated; //** Valid unix time stamp
  $node->date = $nodecreated;
  $node->timestamp = $nodecreated;
  $node->changed = $nodecreated;
  $node->promote = $nodepromoted;
  $node->sticky = $nodesticky;
  $node->language = 'de';
  $node->format = '1';
  $node->pathauto_perform_alias = 1;
  $node->title = $nodetitle; //Optional if this is an existing node
  $node->teaser = $nodebody;
  $node->body = $nodebody;

  // CCK Fields
  // field_gaestebuch_name
  $node->field_gaestebuch_name[0]['value'] = $gaestebuch_name; // CCK field for Name
  // field_gaestebuch_email
  $node->field_gaestebuch_email[0]['email'] = $gaestebuch_email; // CCK field for email
  // field_gaestebuch_homepage
  //  field_gaestebuch_homepage[0][title]
  //  field_gaestebuch_homepage[0][url]
  $node->field_gaestebuch_homepage[0]['title'] = $gaestebuch_homepage; // CCK field for Homepage Title
  $node->field_gaestebuch_homepage[0]['url'] = $gaestebuch_homepage; // CCK field for Homepage URL
  $node->field_gaestebuch_homepage[0]['attributes'] = 'N;';

  //DEBUG Level Info
  if ($cfdebuglevel >= 3) {
    output("<br>Creating new node with Title: ".$node->title." from user: ".$node->uid." Creation Date: ". unixtime2date($node->created)."<br>");
  }

  node_validate($node);
  $node = node_submit($node);
  $node->created = $nodecreated; //** Valid unix time stamp
  $node->date = $nodecreated;
  $node->timestamp = $nodecreated;
  node_save($node);  //Actually save or edit the node now
  content_insert($node);

  $nid = $node->nid;

  //DEBUG Level Info
  if ($cfdebuglevel >= 3) {
    output("... done: new Node ID: ".$nid."<br>");
  }
}
?>

Best regards,

Chris

Great work chris

Great work chris,
It works for D6. You may get an error saying the function node_object_prepare was not found. Just add the following code at the beginning of your snippet:

if (!function_exists("node_object_prepare")) {
  include_once(drupal_get_path('module', 'node') . '/node.pages.inc');
}

That will let you sail through. In fact, I have used it to load a CCK node with workflow, and the default states are created flawlessly.
Thanks

if you have a cck Date field

I found I had to comment out node_validate and content_insert in Chris's code to get a Date field inserted without throwing errors. If I leave content_insert in there, I get a duplicate key error, and if I leave node_validate in there, I get a "You have to specify a valid date" error.

Having said that, it would have taken me an eon to get this to work without Chris and the other folks putting in their valuable contributions. Thanks.

Multiple calls to drupal_execute() in the same HTTP request.

Note that you may encounter issues calling drupal_execute() multiple times in the same HTTP request, especially with CCK widget option fields.

See:

http://drupal.org/node/416126
and
http://drupal.org/node/260934

importing multiple terms from a vocabulary?

Is there a Drupal 5 example of using drupal_execute() to save terms separated by a pipe | to the node array? Or even a node_save() example?

thanks!

Multisite programmatic node creation!

To add to Nicholas Thompson's addition (regarding CLI creation): the key to multisite is to set $_SERVER['HTTP_HOST'] before the drupal_bootstrap call.
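
A sketch of how that might look in a command line script, assuming the hostname is passed as the first argument. The function name example_prepare_multisite() and the fallback hostname are made up for illustration; the only part taken from the comment above is that $_SERVER['HTTP_HOST'] has to be set before bootstrapping.

```php
<?php
/**
 * Pick the site hostname for a CLI run and expose it as HTTP_HOST.
 *
 * Hypothetical helper. $argv is the script's argument list; the first
 * argument, if present, is taken as the hostname of the target site.
 */
function example_prepare_multisite(array $argv, $default_host) {
  $host = isset($argv[1]) ? $argv[1] : $default_host;
  $_SERVER['HTTP_HOST'] = $host;
  return $host;
}

// Usage inside an import script run from the Drupal root:
//   example_prepare_multisite($argv, 'example.com');
//   include_once('includes/bootstrap.inc');
//   drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
```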

I spent a while tracking this down and now it should be in the public record.