Drupalcon: Zen and the Art of Drupal: The Philosophy of Drupal Development

How we work on Drupal

The philosophy that has guided the code in Drupal core is one of the biggest assets of the Drupal project. But it's never been disseminated in a structured way to the community and it's also evolved over time.

With the rapid growth of Drupal and the number of new people coming into the community, it's more and more important that Drupal be proactive and intentional in promoting the design philosophies that have helped fuel its growth and success. In the final session time slot of Drupalcon, Jeff Eaton and James Walker kicked off this conversation about "The Art of Drupal."

The first idea is that Drupal won!

It is no longer a question whether Drupal can be used. It's no longer the new kid on the block. Industries are starting to look to Drupal first, and that's awesome and exciting.

As with all software, it's not perfect, and it won't be used in every case everywhere. But it's now being used by very large businesses and communities.

We not only have legitimacy in being able to do big complicated projects, but also the ability to drive technology forward with innovation. Standards organizations like OpenID are looking at what Drupal is doing, and that is something we should be proud of.

James still remembers a basement in Antwerp, Belgium, when Drupal was still the new kid on the block.

Let's look at the lines of code as an indicator of our growth. In 2005, core Drupal was about 35k lines of PHP code. Today, it's 50k lines with about 8% code -- and that's not even including comments or whitespace. And if you look at contrib, it went from 300k lines in 2005 to nearly 1.8 million lines of code today. Contrib has added about a million and a half lines of code in those 3000 modules over the last three years, which doesn't even include different branched versions of the modules -- that's just the HEAD versions. This is the work of the community trying to solve interesting problems. So Drupal contrib is moving at a frighteningly fast pace.

That's a 3-year sprint with things like CCK and Views getting a chance to evolve in contrib.

Before that, building websites looked very, very different, and we can do some amazing things now. A lot of hard things have been solved in contrib, like Views and CCK, and user relationships for mapping out social graphs. We've ironed out a lot of things and changed how those things get built.

You used to have to grab core, and then the contrib modules were actually named by the feature that they gave you -- like "event.module." Now it's more like a lego approach of taking small building blocks and putting them together.

The Form API is a huge win that now allows other modules to plug into each other, and a bunch of modules can combine with each other in the process of capturing clustered user input.
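
The idea of several modules plugging into one form can be sketched roughly as follows. This is a language-neutral illustration written in Python, not Drupal's actual Form API: the `register_alter` decorator, the `build_form` function, and the two example modules are all hypothetical names made up for this sketch. It assumes a form is just a dictionary of elements that each registered module gets a chance to change before it is shown.

```python
# Hypothetical sketch of a "form alter" pattern; this is NOT Drupal's Form API.
from typing import Callable, Dict, List

Form = Dict[str, dict]

# Every module that wants to change forms adds a callback to this list.
ALTER_CALLBACKS: List[Callable[[str, Form], None]] = []


def register_alter(callback: Callable[[str, Form], None]) -> Callable[[str, Form], None]:
    """Register a callback that may modify any form before it is displayed."""
    ALTER_CALLBACKS.append(callback)
    return callback


def build_form(form_id: str, base: Form) -> Form:
    """Copy the base form, then let every registered module alter the copy."""
    form = {name: dict(element) for name, element in base.items()}
    for callback in ALTER_CALLBACKS:
        callback(form_id, form)
    return form


@register_alter
def tagging_module_alter(form_id: str, form: Form) -> None:
    # This made-up module only cares about one specific form.
    if form_id == "article_form":
        form["tags"] = {"type": "textfield", "title": "Tags"}


@register_alter
def spam_module_alter(form_id: str, form: Form) -> None:
    # This made-up module adds a challenge to every form it sees.
    form["captcha"] = {"type": "captcha", "title": "Are you human?"}


article_form = build_form(
    "article_form", {"title": {"type": "textfield", "title": "Title"}}
)
print(sorted(article_form))  # ['captcha', 'tags', 'title']
```

Neither example module knows the other exists, yet both contribute to the same form. That is the kind of combination the notes describe.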

Modules are interacting together to create an aggregate experience. You have to throw five modules together for building an image gallery instead of just one.

Event module still exists and is still used, but the direction is moving towards the small pieces being glued together, which creates new sticky problems.

We're moving towards placing more importance on fields and a finer-grained focus.

* Nodes vs. Everything -- Should users be nodes? There are solutions in contrib that are layered on other solutions
* Nodes vs. Fields
* Output formats (XML, JSON, RDF) to tie into other web standards -- these are questions that are starting to be answered in the contrib module world. These are more layers of complexity that we're starting to build on top of other layers. It's not an easy thing, especially when you're coordinating volunteers from all over the planet.
* Configuration management

How do we do that in a way that won't kill us when we start to build a big site? Or start to build towards solving the other tough problems?

We can only stack so many dishes on top of other decisions before we need to go back and re-architect some fundamental issues.

Design debt increases if you do not refactor, until the cost of adding new features is greater than the cost of coding them from scratch. This is why Drupal is so willing to break APIs in order to prevent the design debt from building up.

QUESTION: As commercial entities enter, they want slower releases.

Most people want faster releases according to Dries' slides. From a pure project perspective, it's one of the ??? things we face.

The problem isn't building cool websites or aggregating content. The next challenge is the design of Drupal and how the pieces click together, and how we catch that up with the 3-year code sprint.

This is the beginning of a conversation that needs to start, and we'll be pointing to some existing articles that have been written about these things.

FIVE POINTS -- Here are some general principles that we've been thinking about:

1.) Code is for People. People are going to spend a lot more time reading the code than you're going to spend writing it. So code clarity is very, very important.

Eaton admits he has released a lot of code without any documentation, so he is part of the problem here. We all have to understand all of the pieces, whether function by function or at the API level, so it's critical that we consider our code part of a collective conversation about how websites get built and how to solve problems. This doesn't mean that you have to write a 50-page manual for each module that you write, or that we shouldn't release a solution that isn't pretty. One of the main goals is for other people to understand what we write.

2.) Loose Coupling means Happy Code. As an example of how this applies in Drupal: none of the bricks know they're part of a lego car; they just know what's immediately around them. On the other hand, the other car on the other side was designed to be a red car.

As we build pieces that connect to each other, they have to have some knowledge of what they can connect with. But the fewer assumptions they make about what will be there, the better.

For example, image cache resizes images, and image field takes an image and puts it in a field on a node, and they communicate very well with each other.

The point of both of these first two concepts is that when you contribute something to a community of this size, people are going to take that code and try to do something with it. The clearer you've been about how it'll work and the fewer assumptions you make, the better. We are usually scratching our own itch in a specific scenario, and if you contribute it back, other people are going to be using it in a different scenario.

With image field and image cache, it would've been easier to build image cache on top of image field, but since they're split from each other, other people have been able to use image cache with the profile.module, and it's a much more useful tool that way.
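
A rough sketch of that split, written in Python rather than as real Drupal code: the `Preset` class, the `scaled_size` function, and the two caller functions are hypothetical names invented for this illustration, and the sketch only computes target dimensions instead of touching actual image files. The assumption is that the resizing piece accepts plain dimensions and knows nothing about nodes, fields, or profiles.

```python
# Hypothetical sketch of loose coupling; not the real image cache or image field code.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Preset:
    """A named resize rule, e.g. 'fit inside 100x100'."""
    name: str
    max_width: int
    max_height: int


def scaled_size(size: Tuple[int, int], preset: Preset) -> Tuple[int, int]:
    """Shrink (width, height) to fit the preset, keeping the aspect ratio.

    This function makes no assumption about where the image came from.
    Images that already fit are left at their original size.
    """
    width, height = size
    ratio = min(preset.max_width / width, preset.max_height / height, 1.0)
    return (max(1, round(width * ratio)), max(1, round(height * ratio)))


THUMBNAIL = Preset("thumbnail", 100, 100)


def render_node_image(field_value: dict) -> Tuple[int, int]:
    """One caller: an image stored in a field on a node."""
    return scaled_size((field_value["width"], field_value["height"]), THUMBNAIL)


def render_profile_picture(profile: dict) -> Tuple[int, int]:
    """A different caller the resizer's author never planned for."""
    return scaled_size(profile["picture_size"], THUMBNAIL)


print(render_node_image({"width": 800, "height": 600}))    # (100, 75)
print(render_profile_picture({"picture_size": (50, 40)}))  # (50, 40)
```

Because `scaled_size` only assumes a width and a height, the second caller can reuse it without any changes to the resizing code.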

3.) Love your Hacks. Any real website that has to ship someday will always have code that specializes. It'll have to have a hack, and you can love those as hacks and not as lego pieces. Remember the difference between doing a hack and building a lego piece for someone else. Sometimes it's best to say that it's site-specific, that it was something interesting that made your site better and faster, and that it could be spun off in a more generalized way later. This is one of the ways that our community's value of sharing can come back and bite us.

Sometimes these things would make good articles or blog posts, but aren't worthy of being standalone modules. You're going to spend more time taking the hack that worked in your environment and trying to put it out in the community.

For example, to make a rotating banner, you can just make a view, make an image field, and then write 10 lines of PHP code to auto-switch.

Love your hacks, but document why you did them, so that five months later when a security patch comes out, you can go back and remember. Or so that someone else coming in who has to deal with your hacks can understand them.
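
As an illustration of documenting a hack, here is a minimal sketch of the rotating-banner idea in Python (the session's example was PHP). The function name, the rotation rule, and the reasons given in the comments are all hypothetical; the point is only the shape of the note left for whoever reads it next.

```python
# Hypothetical site-specific hack with its reasoning written down next to it.
from datetime import date, timedelta
from typing import Sequence


def pick_banner(banners: Sequence[str], today: date) -> str:
    """Return the banner to show today.

    SITE-SPECIFIC HACK (made-up example):
    - Why: this site only needs one banner per day, so a small function
      was quicker than installing and configuring a general module.
    - Assumption: `banners` is never empty.
    - Revisit if: banners need scheduling, weighting, or per-section rules.
      At that point this should become a proper, reusable piece.
    """
    # The day number modulo the banner count cycles through the list in order.
    return banners[today.toordinal() % len(banners)]


banners = ["spring.png", "summer.png", "autumn.png"]
start = date(2008, 3, 6)
for offset in range(4):
    day = start + timedelta(days=offset)
    print(day.isoformat(), pick_banner(banners, day))
```

The docstring carries the information that is hardest to recover later: why the shortcut was taken, what it assumes, and when it should be replaced.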

4.) Layers Protect You. When you're building a module that does something complicated or exposes an API for awesomeness, building layers of functionality protects you and others. You may want to expose 100% of every single edge case, and that may be awesome and solve a lot of cases, but exposing the data to blocks and node types is a job for its own module. Separating those things from each other is a very good thing. It creates a better design, because it shields you from the complexity of an XML-based web service like Amazon.com. You may then have CCK fields and other modules that each do one special thing; the stacked layers are good, but they need to be designed to work that way.

Take Amazon integration as an example. The first layer is the guts: integrating with the service. That layer is unit testable, so you can see that the bottom layer is working properly and see what effect changes to the lower layers have. You want to build on top of a strong foundation, and not on top of a house of cards.

The take-home point is that when you build tools that other people will be building things on top of and iterating with other modules, you need to think about what the bottom layer is and whose job it is to serve which node type. Then it's easier to change the user interface as well.
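
The layering described above might look something like this in miniature. This is a Python sketch under stated assumptions, not the actual Amazon module or Amazon's real response format: the XML sample, the function names, and the three-layer split are invented for illustration.

```python
# Hypothetical three-layer sketch; the XML format and all names are made up.
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

SAMPLE_RESPONSE = """<items>
  <item><id>B001</id><title>Example Book</title><price>12.50</price></item>
  <item><id>B002</id><title>Another Book</title><price>8.00</price></item>
</items>"""


# Layer 1: the guts. Turns the service's XML into plain dictionaries.
# It has no UI and no storage, so it is easy to unit test on its own.
def parse_items(xml_text: str) -> List[Dict[str, str]]:
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in item}
        for item in root.findall("item")
    ]


# Layer 2: a small API for other code. Callers never see XML.
# `fetch` is injected so tests can supply a canned response.
def lookup_product(product_id: str, fetch: Callable[[], str]) -> Dict[str, str]:
    for item in parse_items(fetch()):
        if item["id"] == product_id:
            return item
    raise KeyError(product_id)


# Layer 3: presentation. Can be swapped out without touching the layers below.
def render_product_block(product: Dict[str, str]) -> str:
    return f"{product['title']} (${product['price']})"


product = lookup_product("B002", lambda: SAMPLE_RESPONSE)
print(render_product_block(product))  # Another Book ($8.00)
```

Each layer can be tested and replaced separately: a different display only needs a new top layer, and a change in the service's format only touches the bottom one.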

5.) Simple Means Less. One lesson from the usability study results is that the surest way to make something simpler is to take things away. You can shuffle things around, but sometimes the only way to make it simpler is to remove them. The better your layers are, the easier they are to strip off. Layered code exists in a lot of places.

Modules that are the heart of many sites are written like that. Views Core and the Views UI are fairly separated out, and it's possible to have a different UI for views. That's what we're trying to encourage.

In some parts of Drupal, we have kept on adding lots of information in collapsible fieldsets over the years, and now have 747 cockpit syndrome on our admin pages.

In Drupal 6, there's a lot more to the Actions and Triggers system than what is integrated in the UI.

Another important issue is that this is not an information dump. Design is a conversation in this open source project. How should data storage work? How should views be integrated? How should we interact with outside systems with JSON, XML, etc.? These are the types of questions that we should be thinking about when we remember that we should write code for other people, keep our code loosely coupled, love our hacks for what they are, separate functionality with different layers, and keep things simple when possible by stripping out edge case functionality.

eaton and walkah will be writing a series of articles that continue this discussion.

They're serious issues, and we can't ignore the design debt. And the fact that we can deal with them now means that we have succeeded and it can allow us to think about the future.

QUESTION: Is there a model to capture site configurations in some sort of way?

Yes, the Patterns.module is one that is trying to capture that -- http://drupal.org/project/patterns
