Quick-and-dirty CCK imports
April 13, 2007
Recently I needed to import a small (100-200) node set from an 'old' database to a CCK content type. I used the following code, which serves as a nice example of the drupal_execute() function.
<?php
$results = db_query("SELECT * FROM recipe");
while ($recipe = db_fetch_object($results)) {
  // I was converting old style nodes to CCK nodes.
  // If you're building them from scratch you would
  // want to use: $node = array('type' => 'story'),
  // where 'story' is the type of node you want to
  // create.
  $node = node_load($recipe->nid);
  $values = array();

  // You'll recognize this as the structure of
  // form_values you'll see in a submit or validate
  // handler. You're basically building it manually,
  // rather than using $_POST from a user.
  $values['field_servings'][0]['value'] = $recipe->yield;
  $values['field_prep_time'][0]['key'] = $recipe->preptime;

  $i = 0;
  $ingredients = db_query("SELECT ingredient FROM recipe_ingredients WHERE nid = %d ORDER BY weight ASC", $recipe->nid);
  while ($ing = db_fetch_object($ingredients)) {
    $values['field_ingredients'][$i++]['value'] = $ing->ingredient;
  }

  // Multivalue CCK fields are handled this way --
  // the name of the field, then a numerical key for
  // each individual field instance, then a 'value'
  // key. Single-value fields are actually handled
  // the same way, but they only have the '0' delta.
  $values['field_directions'][0]['value'] = $recipe->instructions;
  $values['body'] = $recipe->notes;
  $values['field_source'][0]['value'] = $recipe->source;
  $values['status'] = 1;
  //$values['promote'] = 1;

  // 'recipe_node_form' would be 'story_node_form' or
  // whatever node type you're creating.
  drupal_execute('recipe_node_form', $values, $node);
}
?>
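The delta structure described in the comments above can be built with a small helper. This is only a sketch: cck_field_values() is a hypothetical name, not part of Drupal or CCK, and it only constructs the array. The result would still be passed to drupal_execute() as in the example above.

```php
<?php
/**
 * Hypothetical helper (not part of Drupal or CCK): builds the
 * array(delta => array('value' => ...)) structure for one field.
 */
function cck_field_values(array $items, $key = 'value') {
  $field = array();
  // array_values() renumbers the items so the deltas run 0, 1, 2, ...
  foreach (array_values($items) as $delta => $item) {
    $field[$delta][$key] = $item;
  }
  return $field;
}

$values = array();
// Single-value field: one item, so only delta 0 is present.
$values['field_servings'] = cck_field_values(array('4'));
// Multi-value field: one delta per ingredient, in order.
$values['field_ingredients'] = cck_field_values(array('flour', 'eggs', 'milk'));
```

The optional second argument covers fields that use a different key, such as the 'key' entry used for field_prep_time above.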





Comments
An alternative method...
If you're on Linux (and for webservers, why use anything else?) you could make a PHP script.
The first two lines of the file should be:
#!/usr/bin/php
<?php
(assuming php is installed in /usr/bin)
Then, once you chmod the file to be executable by you (e.g., chmod 700), you can just run the script as a command. If you put this script in your Drupal install folder, you could do something like this...
<?php
exec("/usr/bin/clear");
include_once('includes/bootstrap.inc');
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
bootstrap_invoke_all('init');
ini_set('memory_limit', '512M');
echo "Bootstrap successful\n\n";

// Authenticate as user 1
user_authenticate('admin', 'password');

/*
 * Terms
 * These are hard-coded mappings of old Wordpress category IDs to Drupal term IDs...
 */
$terms = array(
  8 => 195,
  17 => 196,
  4 => 197,
  7 => 198,
  11 => 199,
  13 => 200,
  16 => 201,
  15 => 202,
  20 => 203,
  21 => 204,
  9 => 205,
  3 => 206,
  19 => 207,
  18 => 208,
  6 => 209,
  14 => 210,
  10 => 211,
  12 => 212,
);

// Get some Wordpress posts...
$result = db_query("SELECT ID, post_date, post_title, post_content, guid FROM {wp_posts} WHERE post_status='publish'");
while ($post = db_fetch_object($result)) {
  echo t("Importing '%title'... ", array('%title' => $post->post_title));

  // Term data: map each Wordpress category to its Drupal term, if known.
  $result2 = db_query("SELECT category_id FROM {wp_post2cat} WHERE post_id=%d", $post->ID);
  $post_terms = array();
  while ($t = db_fetch_object($result2)) {
    if (isset($terms[$t->category_id])) {
      $post_terms[] = $terms[$t->category_id];
    }
  }

  $node = new stdClass();
  $node->title = $post->post_title;
  $node->type = 'blog';
  $node->body = $post->post_content;
  $node->status = 1;
  $node->format = 1;
  $node->moderate = 0;
  $node->promote = 0;
  $node->sticky = 0;
  $node->revision = 0;
  $node->name = 'whatsnew';
  $node->comment = 0;
  $node->date = $post->post_date;
  $node->taxonomy = $post_terms;

  $node = node_submit($node);
  node_save($node);

  // Reuse the old URL as the alias: skip the first 26 characters of the
  // guid (the old site's base URL) and drop the last character.
  path_set_alias("node/" . $node->nid, substr($post->guid, 26, -1));
  echo " Done\n";
}
?>
The only issue is that if you use pathauto, path_set_alias() will add your custom URL as a second alias alongside the pathauto one. Quite easy to fix, but bloody annoying!! ;-)
I used this recently. Imported 160 blog posts into Drupal in about 3 seconds.
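The substr($post->guid, 26, -1) call in the script above only works when the old site's base URL is exactly 26 characters long. A less brittle variant, sketched here, takes the path portion of the guid instead. alias_from_guid() is a hypothetical helper, and the sketch assumes the old site lived at the root of its domain and that the guid is a "pretty" permalink rather than a "?p=123" style URL.

```php
<?php
/**
 * Hypothetical helper: derives a path alias from a post's guid URL by
 * taking the URL path and trimming the surrounding slashes.
 * Returns NULL when the guid has no usable path (e.g. "?p=123" style).
 */
function alias_from_guid($guid) {
  $path = parse_url($guid, PHP_URL_PATH);
  if (!is_string($path)) {
    return NULL;
  }
  $alias = trim($path, '/');
  return $alias === '' ? NULL : $alias;
}

$alias = alias_from_guid('http://www.example.com/2007/04/my-post/');
// In the import loop, this would stand in for the substr() call:
// if ($alias !== NULL) { path_set_alias('node/' . $node->nid, $alias); }
```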
I tried running
I tried running drupal_execute() as a command line script using this method, but form.inc doesn't like it. Seems something's missing, any ideas?
I'm importing data into 30,000+ CCK nodes on a site, and doing it through a browser isn't much fun...
Not having much luck...
Hey, I'm not having much luck with the above code. In particular, it's the taxonomy portion that I can't get going. What version of Drupal is the above code written for? I can't seem to import stuff into the DB with taxonomy data...
drupal_execute
I'm trying to understand the purpose for drupal_execute() (and I'm sure there is one), but AFAIK it's simply a way to submit data to Drupal programmatically, simulating an actual form submission?
Why not simply use node_save()?
Looking forward to getting my head around that one...
Full form life cycle
You're right -- it IS just a way to simulate a full form submission without any user input. There are several reasons to do things that way:
In general, drupal_execute, because it still goes through node_save() after all the other processing is done, is more robust when you have a bunch of modules installed that muck with the node edit form.
Makes perfect sense
So when one wants to shield data migration from other modules and/or processes, go straight to node_save...otherwise keep it extensible and use drupal_execute.
That will be handy in the future...so far I only had to deal with the first scenario.
Cheers
fair assessment
It's fair to say that if you wanted to block other modules out, using node_save is a way to achieve that.
Don't forget node validation though. That is also bypassed, and might cause errors that are hard to track.
Retrieving node nid
Thanks Jeff for this perfect article.
I have a question: assume I have added content using drupal_execute(). How can I retrieve the nid of the node I have added? I am looking for a standard way. I could run a query that gives me the last nid, but that does not look correct to me, because another instance of the script might have added another node at the same time. It's a rare situation, but possible.
Thanks.
Insert hooks for getting nid
hook_insert should pass in a node with a node id. I needed to use the insert op under nodeapi (http://api.drupal.org/api/function/hook_nodeapi/5) since the module was not registered for that CCK type (the next step).
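A minimal sketch of that approach follows. The module name "mymodule" and the helper mymodule_last_inserted_nid() are hypothetical. The value is kept in a static variable, so it is private to the running script and a second import running at the same time cannot overwrite it.

```php
<?php
/**
 * Hypothetical helper: remembers the nid of the last node this request
 * inserted. Call with a nid to store it, without arguments to read it.
 */
function mymodule_last_inserted_nid($nid = NULL) {
  static $last = NULL;
  if ($nid !== NULL) {
    $last = $nid;
  }
  return $last;
}

/**
 * Implementation of hook_nodeapi() for the hypothetical module "mymodule".
 */
function mymodule_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  // By the time the 'insert' op fires, the node has been saved and has a nid.
  if ($op == 'insert') {
    mymodule_last_inserted_nid($node->nid);
  }
}

// After drupal_execute('recipe_node_form', $values, $node), the import
// script would read the nid back with:
// $nid = mymodule_last_inserted_nid();
```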
How about importing CCK image fields?
Hi, what would this code look like if we wanted to migrate some images alongside the posts?
importing images
Step 1: Make sure that the images are somewhere under the files/ directory
Step 2: Populate the files table using an sql INSERT query
fid => db_next_id('{files}_fid')
filename => use php function basename() for the image filename
filepath => relative path to the image on the drupal site
filemime => the mimetype, likely to be one of these: 'image/gif', 'image/jpeg', 'image/png'
filesize => use php function filesize() on the image file
Step 3: Load in the saved node object, and save the file id
$n = node_load($nid);
$n->field_myimage[0]['fid'] = $fid;
node_save($n);
Step 4: update the files table with the node id (nid)
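The column values from step 2 can be collected in one place before running the INSERT. This is a sketch only: build_files_row() is a hypothetical helper, the MIME type is guessed from the file extension, and $fid is assumed to have come from db_next_id('{files}_fid') as described above.

```php
<?php
/**
 * Hypothetical helper: builds the column values described in step 2 for
 * one image. $filepath is the path to the image relative to the Drupal
 * root, e.g. 'files/images/cake.png'. Returns FALSE if the file is
 * missing or is not one of the image types listed above.
 */
function build_files_row($fid, $filepath) {
  // Map of the image extensions mentioned above to their MIME types.
  $mimetypes = array(
    'gif' => 'image/gif',
    'jpg' => 'image/jpeg',
    'jpeg' => 'image/jpeg',
    'png' => 'image/png',
  );
  $extension = strtolower(pathinfo($filepath, PATHINFO_EXTENSION));
  if (!isset($mimetypes[$extension]) || !is_file($filepath)) {
    return FALSE;
  }
  return array(
    'fid' => $fid,
    'filename' => basename($filepath),
    'filepath' => $filepath,
    'filemime' => $mimetypes[$extension],
    'filesize' => filesize($filepath),
  );
}

// Inside Drupal, the row would then feed the INSERT from step 2, e.g.:
// db_query("INSERT INTO {files} (fid, filename, filepath, filemime, filesize)
//   VALUES (%d, '%s', '%s', '%s', %d)", $row['fid'], $row['filename'],
//   $row['filepath'], $row['filemime'], $row['filesize']);
```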
Thanks for the article! is
Thanks for the article! Is there a counterpart to drupal_execute() in Drupal 4.7?
How to create a menu item as well
Referring to the very first piece of code on this page, how can I go about creating a menu item for the node as well? Say I was driving the page type node form...
Exactly what I was looking for.
Although I'm not importing from Wordpress, this is pretty much exactly what I've been looking for. I needed to import flexinodes from an old DB into a new DB for a new website. I couldn't figure out how to do this until I found this article. Thanks for the insight!
What about importing CCK Date fields
Hi
Great article. How would you modify it to import CCK date fields?
DJ
Date fields
I have had several people reporting trouble getting this to work for date fields in the Date 5.2 version and after trying to get it to work myself I found out why.
Date 5.2 uses FAPI to send the form values to its validator by adding a #validate item to the form. The date module collects info in widgets that may not look like you would expect them to -- the select widget expects to be getting back an array with 'year', 'month', 'day', etc. The Date Popup widget expects to get back an array of 'date' and 'time'. And if you have a date with timezones, you need to provide the timezone name in the arrays. These values are validated and processed back into string or integer date values in the date validation function. If the validation function doesn't get the right info in the right format, it will fail, but to provide the right info you have to know exactly what is expected by the specific widget and field.
On top of that I ran into some sort of quirk in the D5 FAPI processing if you try to create a bunch of nodes at one time from an array of date values (see http://drupal.org/node/258572). When doing that I found that the validation functions don't get called at all after the first node is created, so all the subsequent nodes miss that processing and end up with the wrong values for the field and end up creating nodes with empty values. I suspect it is something about the way that the form is cached or the re-use of the token or something like that that makes the processor think no validation is needed, but I couldn't find a way to get drupal_execute to treat each new node as a completely fresh form. I'm sure there must be a way, I just couldn't find it.
Anyway, I've been telling people not to use drupal_execute to import dates and instead to use node_save(), which seems much simpler to use. I realize the reason for suggesting drupal_execute is so the values will get validated, but I'm not sure that is even reliably true, depending on how you're doing the import and how the validation is done. Using node_save means you need to check for yourself that the values are correct before you import them, but if you're importing values with a script, you should be able to do that check as a step in your processing.
The 5.1 version of the date module does not use #validate, so it doesn't have a validator choking on the wrong values. And the Date module may be the only D5 CCK field doing its own validation that way, so this may not be a problem in other fields.
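As an illustration of doing your own checking before node_save(), here is a sketch of a pre-import date check in plain PHP. import_date_is_valid() is a hypothetical helper, and the 'Y-m-d H:i:s' format is an assumption; substitute whatever format your source data and date field actually use.

```php
<?php
/**
 * Hypothetical helper: TRUE if $value is a real date in exactly $format.
 * The default format is an assumption; use whatever your field stores.
 */
function import_date_is_valid($value, $format = 'Y-m-d H:i:s') {
  $date = DateTime::createFromFormat($format, $value);
  // createFromFormat() rolls impossible dates over (Feb 30 becomes Mar 1),
  // so compare the round-tripped string with the input to catch them.
  return $date !== FALSE && $date->format($format) === $value;
}

$rows = array('2008-05-14 09:30:00', '2008-02-30 10:00:00', 'next tuesday');
// Keep only the rows that are safe to hand to node_save().
$valid = array_filter($rows, 'import_date_is_valid');
```

Rows that fail the check can be logged and fixed by hand rather than silently imported with empty or shifted values.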