Alfresco’s Hibernate Domain Model

February 27, 2010 at 7:26 pm (Alfresco, Alfresco Activity, Alfresco Attribute, Alfresco Audit, Alfresco Locale, Alfresco Node, Alfresco Patch, Alfresco Permission, Alfresco QName, Alfresco Transaction, Alfresco Usage, Alfresco Version, Domain Model, Hibernate)

Since Hibernate (http://www.hibernate.org/) is used to implement persistence in Alfresco, I sat down last evening and studied its mapping files to determine what the domain model looks like. Here are my findings:

Figure 1: Overview of Alfresco Domain Model

Generally speaking, Alfresco domain entities can be grouped into a number of functional packages:

Nodes
Contains the entities used to represent objects that are stored in the Alfresco repository. Types and Aspects are nodes in this definition.
Permissions
Contains the entities used to represent security objects in Alfresco. Permissions, authorities, ACEs and ACLs are examples of such objects.
Attributes
Contains the entities used to represent Alfresco object attributes. There are different kinds of attributes depending on the data types of values assigned to the attributes.
Version Counts
Contains the entities used to represent Alfresco node version information. Version counts are associated with a given Alfresco store.
Usage Deltas
Contains the entities used to represent the storage information of an Alfresco node (e.g. its size). These are used to manage user quotas.
Transactions
Contains the entities used to represent the transaction history of a given node. Transactions are tied to servers.
Activities
Contains the entities used to represent the activities of a user on a given (Share or WCM?) site.
Applied Patches
Contains the entities used to represent the patches that have been applied to a particular Alfresco instance.
Locales
Contains the entities used to represent a persistent Locale entity.
QNames
Contains the entities used to represent the QName of an Alfresco node.
Audits
Contains the entities used to represent the audit history of an Alfresco object.

The three packages I am interested in looking at are Nodes, Permissions, and Attributes.

Figure 2: Alfresco Node Model

The fundamental entity of the Hibernate domain model is the Node, which represents an object that can be persisted in the Alfresco repository. Nodes can have one or more properties assigned to them as well as a distinct set of aspects, and every node can be secured by an Access Control List. In addition, Nodes are persisted within a given store of an Alfresco repository, all are associated with a given transaction, and all can participate in two different kinds of relationships with other nodes — a parent-child relationship, and a source-target relationship.

Some additional comments:

  • I don’t know that I have modeled Aspects and Properties right. In the source code, they are defined as Set<Long> and Map, respectively. But how these two are associated with other nodes or attributes (which they must be) I am not able to answer without looking at the code some more.
  • It seems to me that Nodes must also have QNames assigned to them, but this fact isn’t depicted in the figure above. Again, I will have to study the code some more to determine the relation between Nodes and QNames.
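To make the first comment above more concrete, here is a minimal sketch of how a Hibernate 3 mapping file can persist a Set<Long> and a Map on an entity. Every class, table, and column name in it is hypothetical and chosen only for illustration; it is not copied from Alfresco's mapping files, and the String key and value types of the map are assumed to keep the sketch small. What it does show is that a collection of Long values lives in its own side table keyed by the owning node, so those values would have to be resolved against some other table (plausibly the QName table, though that is a guess) to learn which aspects they stand for.

```xml
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<!-- Hypothetical sketch: all names are illustrative, not Alfresco's own. -->
<hibernate-mapping>
   <class name="example.ExampleNode" table="example_node">
      <id name="id" column="id" type="long">
         <generator class="native" />
      </id>

      <!-- Set<Long>: one row per (node, value) pair in a side table. -->
      <set name="aspects" table="example_node_aspects">
         <key column="node_id" />
         <element column="aspect_ref" type="long" not-null="true" />
      </set>

      <!-- Map with assumed String keys and values: keyed rows in a second side table. -->
      <map name="properties" table="example_node_properties">
         <key column="node_id" />
         <map-key column="property_key" type="string" />
         <element column="property_value" type="string" />
      </map>
   </class>
</hibernate-mapping>
```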

Figure 3: Alfresco Permission Model

In the Alfresco Permission Model, an Access Control Entry associates an Authority with a Permission in a specified security context. An Access Control Entry can also be a member of any number of Access Control Lists within Alfresco. An Alias can be defined for any Authority within the system.

Figure 4: Alfresco Attribute Model

The Attributes domain model looks complicated, but it really isn’t. A different subclass of the base Attribute entity is created depending on the data type of the value it contains. If the value is one of Java’s primitive data types or a String, its type will be one of the following:

  • ByteAttribute
  • BooleanAttribute
  • DoubleAttribute
  • FloatAttribute
  • IntAttribute
  • LongAttribute
  • ShortAttribute
  • StringAttribute

If its value is not one of these simple data types, its type will be one of the following:

  • SerializableAttribute
  • ListAttribute
  • MapAttribute

Each attribute can be assigned an Access Control List to secure it, and its reference can be stored in a Global Attribute Table for quick lookup. I have to research the code some more before I can tell how ListEntry and MapEntry are used in Alfresco.
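The subclass-per-data-type idea can be sketched as a Hibernate table-per-class-hierarchy mapping, in which a discriminator column records which subclass each row belongs to. This is only an illustration under assumptions: the package, table, column names, and discriminator values are hypothetical, only four of the subclasses are shown, and I have not confirmed that Alfresco uses this particular inheritance strategy.

```xml
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<!-- Hypothetical sketch: all names and discriminator values are illustrative. -->
<hibernate-mapping package="example.attributes">
   <class name="Attribute" table="example_attributes" abstract="true">
      <id name="id" column="id" type="long">
         <generator class="native" />
      </id>

      <!-- One column tells Hibernate which subclass to instantiate for a row. -->
      <discriminator column="attr_type" type="string" length="1" />

      <!-- Each subclass adds only the column that holds its typed value. -->
      <subclass name="BooleanAttribute" discriminator-value="B">
         <property name="booleanValue" column="bool_value" type="boolean" />
      </subclass>
      <subclass name="LongAttribute" discriminator-value="L">
         <property name="longValue" column="long_value" type="long" />
      </subclass>
      <subclass name="StringAttribute" discriminator-value="S">
         <property name="stringValue" column="string_value" type="string" />
      </subclass>
      <subclass name="SerializableAttribute" discriminator-value="Z">
         <property name="serializableValue" column="serializable_value" type="serializable" />
      </subclass>
   </class>
</hibernate-mapping>
```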


Types and Aspects in Alfresco

February 21, 2010 at 8:13 pm (Alfresco, Aspect, Content Model, Content Type)

I have been working with the Alfresco ECM (http://www.alfresco.com/) for about 8 months now and have learned that, to use it effectively, you need to develop a content model that is accurate, lightweight, and extensible, and that the development of this model must be managed by a governance body. Unless these conditions are met, the cohesiveness of the content model can fracture, making it hard to change later on; the ability to find documents and manage their retention policies can be compromised; and steering committees can be hindered in their ability to chart future directions for an organization.

One thing a person can do to develop a good content model is to know the difference between a folder type, a content type, and an aspect — and then to develop a content model with this understanding in mind. The following snippet of code (from systemModel.xml) will help us here:

<type name="sys:base">
 <title>Base</title>
 <mandatory-aspects>
	<aspect>sys:referenceable</aspect>
 </mandatory-aspects>
</type>

<aspect name="sys:referenceable">
 <title>Referenceable</title>
 <properties>
	<property name="sys:store-protocol">
	   <type>d:text</type>
	   <mandatory enforced="true">true</mandatory>
	</property>
	<property name="sys:store-identifier">
	   <type>d:text</type>
	   <mandatory enforced="true">true</mandatory>
	</property>
	<property name="sys:node-uuid">
	   <type>d:text</type>
	   <mandatory enforced="true">true</mandatory>
	</property>
	<property name="sys:node-dbid">
	   <type>d:long</type>
	   <mandatory enforced="true">true</mandatory>
	</property>
 </properties>
</aspect>

as will the following (from contentModel.xml):

<type name="cm:cmobject">
 <title>Object</title>
 <parent>sys:base</parent>
 <properties>
	<property name="cm:name">
	   <title>Name</title>
	   <type>d:text</type>
	   <mandatory enforced="true">true</mandatory>
	   <index enabled="true">
		  <atomic>true</atomic>
		  <stored>false</stored>
		  <tokenised>both</tokenised>
	   </index>
	   <constraints>
		  <constraint ref="cm:filename" />
	   </constraints>
	</property>
 </properties>
 <mandatory-aspects>
	<aspect>cm:auditable</aspect>
 </mandatory-aspects>
</type>

<type name="cm:folder">
 <title>Folder</title>
 <parent>cm:cmobject</parent>
 <archive>true</archive>
 <associations>
	<child-association name="cm:contains">
	   <source>
		  <mandatory>false</mandatory>
		  <many>true</many>
	   </source>
	   <target>
		  <class>sys:base</class>
		  <mandatory>false</mandatory>
		  <many>true</many>
	   </target>
	   <duplicate>false</duplicate>
	   <propagateTimestamps>true</propagateTimestamps>
	</child-association>
 </associations>
</type>

<type name="cm:content">
 <title>Content</title>
 <parent>cm:cmobject</parent>
 <archive>true</archive>
 <properties>
	<property name="cm:content">
	   <type>d:content</type>
	   <mandatory>false</mandatory>
	   <!-- Although content is marked as indexed atomically it may end up asynchronous -->
	   <!-- if the content conversion will take too long. Content that does not require conversion -->
	   <!-- to UTF8 test/plain will always be indexed atomically -->
	   <index enabled="true">
		  <atomic>true</atomic>
		  <stored>false</stored>
		  <tokenised>true</tokenised>
	   </index>
	</property>
 </properties>
</type>

<aspect name="cm:auditable">
 <title>Auditable</title>
 <properties>
	<property name="cm:created">
	   <title>Created</title>
	   <type>d:datetime</type>
	   <protected>true</protected>
	   <mandatory enforced="true">true</mandatory>
	   <index enabled="true">
		  <atomic>true</atomic>
		  <stored>false</stored>
		  <tokenised>both</tokenised>
	   </index>
	</property>
	<property name="cm:creator">
	   <title>Creator</title>
	   <type>d:text</type>
	   <protected>true</protected>
	   <mandatory enforced="true">true</mandatory>
	</property>
	<property name="cm:modified">
	   <title>Modified</title>
	   <type>d:datetime</type>
	   <protected>true</protected>
	   <mandatory enforced="true">true</mandatory>
	   <index enabled="true">
		  <atomic>true</atomic>
		  <stored>false</stored>
		  <tokenised>both</tokenised>
	   </index>
	</property>
	<property name="cm:modifier">
	   <title>Modifier</title>
	   <type>d:text</type>
	   <protected>true</protected>
	   <mandatory enforced="true">true</mandatory>
	</property>
	<property name="cm:accessed">
	   <title>Accessed</title>
	   <type>d:datetime</type>
	   <protected>true</protected>
	   <index enabled="true">
		  <atomic>true</atomic>
		  <stored>false</stored>
		  <tokenised>both</tokenised>
	   </index>
	</property>
 </properties>
</aspect>

According to the Alfresco content model, folders (cm:folder) and documents (cm:content) are different kinds of content management objects (cm:cmobject) that can be stored and referenced within the repository (i.e. they subtype sys:base). As content management objects, both can be audited within the system and both have names that are tokenized for fast lookup using Lucene. Each can also be archived. In fact, the only difference between a folder and a document is that the former can contain any number of sys:base nodes (through the cm:contains child association) while the latter can only reference a single file present in the repository.

Aspects, on the other hand, enhance content types by supplementing their properties and behaviours. They do not have their own identifiers and cannot be stored or referenced apart from their types. They are also cross-cutting in the sense that they can be associated with any type within a content model — in fact, more than one type, in cases where the behaviour of a given aspect can be applied in a meaningful way to more than one type. In the code snippet above, sys:referenceable and cm:auditable are aspects that are applied to sys:base and cm:cmobject content types, respectively.
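As a small illustration of the cross-cutting point, the fragment below attaches one aspect to two otherwise unrelated types, a document type and a folder type. The ex prefix and every name under it are hypothetical; only cm:content, cm:folder, and d:date come from Alfresco's own models.

```xml
<types>
   <!-- A document type that always carries the aspect. -->
   <type name="ex:memo">
      <title>Memo</title>
      <parent>cm:content</parent>
      <mandatory-aspects>
         <aspect>ex:reviewable</aspect>
      </mandatory-aspects>
   </type>

   <!-- A folder type that carries the very same aspect. -->
   <type name="ex:caseFolder">
      <title>Case Folder</title>
      <parent>cm:folder</parent>
      <mandatory-aspects>
         <aspect>ex:reviewable</aspect>
      </mandatory-aspects>
   </type>
</types>

<aspects>
   <!-- The aspect is defined once and reused by both types above. -->
   <aspect name="ex:reviewable">
      <title>Reviewable</title>
      <properties>
         <property name="ex:reviewDate">
            <type>d:date</type>
         </property>
      </properties>
   </aspect>
</aspects>
```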

Now that we know what folders, documents, and aspects are in Alfresco, we can lay down a number of rules of thumb that can be used to develop an effective content model in Alfresco:

  • If you are trying to model a document within your organization such as a report card, an inter-office memo, a tax form, etc., define it as a cm:content type if you don’t need to attach any additional metadata to it, or subtype cm:content if you do. If you need to associate a government-issued filing number with a tax-form, for instance, you could model it as follows:
    <type name="tax-form:t4Type">
        <title>CCRA T4 Form</title>
        <parent>cm:content</parent>
        <properties>
            <property name="tax-form:filingNumber">
                <type>d:text</type>
            </property>
        </properties>
    </type>
    
  • If you are trying to model a folder, i.e. a node that contains one or more sys:base types, define it as a cm:folder type if you don’t need to attach any additional metadata to it, or subtype cm:folder if you do. For instance, if you need to know the name of the project that a folder belongs to, you could define its content type as follows:
    <type name="projects:projectFolderType">
        <title>Project Folder</title>
        <parent>cm:folder</parent>
        <properties>
            <property name="projects:projectName">
                <type>d:text</type>
            </property>
        </properties>
    </type>
    
  • If you know that an object has one or more properties, and that a subset of these properties can be used to identify the object within a business process, then chances are the object should be modeled as a content type.
  • If you are not sure whether an object has properties, but you do know that the object needs to be stored and retrieved directly to and from the repository, then chances are the object should be modeled as a content type. Study your use cases to learn what its business identifier might be.
  • If you are certain the object you are modeling does not contain any properties, then the object should be modeled as an aspect since it does not have the values needed to form a business identifier. For instance, Alfresco uses the following aspect to indicate that an object is classifiable (see contentModel.xml):
    <aspect name="cm:classifiable">
        <title>Classifiable</title>
    </aspect>
    

    Such aspects are similar to marker interfaces in Java.

  • If the object you are modeling has properties and you are confident that none of them together form an identifier of any sort (in particular, a business identifier), then chances are good that the object is an aspect that is used by some business process to perform an activity on a particular content type. This is true of the cm:auditable aspect above. None of the Created, Creator, Modified, Modifier, or Accessed properties can be used singly or in combination to define a business identifier. But they can be used to generate audit reports for the content types that the aspect is applied to.
  • If the object you are modeling does not have a lifecycle or retention policy, then you are probably modeling an aspect that can’t live apart from a content type.
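The type and aspect fragments in these rules only load when they sit inside a model definition that declares their namespace prefixes. A minimal sketch of such a wrapper for the tax form example follows. The model name, description, and the namespace URI bound to the tax-form prefix are made up for illustration, while the two imported URIs are the ones used by Alfresco's dictionary and content models.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<model name="tax-form:model" xmlns="http://www.alfresco.org/model/dictionary/1.0">
   <description>Hypothetical tax form model</description>
   <version>1.0</version>

   <!-- Pull in the d: data types and the cm: content model. -->
   <imports>
      <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d" />
      <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm" />
   </imports>

   <!-- Declare the prefix used by the custom type (URI is an assumption). -->
   <namespaces>
      <namespace uri="http://www.example.com/model/tax-form/1.0" prefix="tax-form" />
   </namespaces>

   <types>
      <type name="tax-form:t4Type">
         <title>CCRA T4 Form</title>
         <parent>cm:content</parent>
         <properties>
            <property name="tax-form:filingNumber">
               <type>d:text</type>
            </property>
         </properties>
      </type>
   </types>
</model>
```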
