Documentum on AIX

27 May 2013

We run Documentum on AIX. Under stress the application performed not well. Looking more deeply into the Java, we saw a lot of garbage collection and waits. The default garbage collection on AIX is OPTTHRUPUT. This is not the best for applications such as Documentum with only quick objects. The default is good for batch operations, so large objects and more time.

So we change to the more appropriate GENCON and it went smoother, less GC and less waits. But now we have to look at the application because some code is still not performing, but I leave that to the Java/Documentum guys.

Oracle 11g and Documentum (Part 4)

20 May 2013

Long time nothing mentioned, but we got further with Oracle. The first major task was the upgrade to Oracle 11.2. It isn’t the hardest task but must be done. It is nothing more than shutting down, installing new Oracle software and do the upgrade. That’s all. We went to 11.2.0.3.

 

What brought this to us? Well, we had many more queries that went wrong in this version, but still some did. We observed that the Documentum standard queries did well, but some newly coded ones not. After looking at these, sometimes a rewrite was the best, sometimes a new index must be made. At the end we had only 2 left, we solved it by using baselines in Oracle.

 

One thing to mention, is that all above was done in an testing environment. When we go live, maybe some other queries goes mad, and we still need some tuning. But that’s for later.

 

Google style search

14 July 2012

I wanted to let you know that on the Oracle Search text Roger Ford is developing a Google-style search package to let you allow to implement a Google style search in your application. Is is stille beta but give it try if you need such a search.

See: https://blogs.oracle.com/searchtech/entry/oracle_text_query_parser

New features Oracle Text 11g part 1: Mixed queries

8 December 2010

Oracle 11g gave Oracle Text a boost. Oracle added many new features. In the upcoming posts I will give the new features of Oracle Text 11g.

The first one I want to show is the feature called “Mixed Queries”. The feature is added to the Context Index, it was already an option for Cxtcat Index. By using “Mixed Queries”, you include in the text index not only the column on which you want to search but also on extra fields, on which you normally would put an index and on which you want an additional search.

Example:

select title from books
where contains(content,'oracle') > 0
and   published_year = 2006
and   price < 40
order by price asc

The above query searches for book containing the word oracle in the content, published in the year 2006 and it price is below €40 (yes. I’m in Europe), ordered by the price. In Oracle 10g and before you could index the column content with a Context index, but nor the other 2 columns (Ok, in 10g you have the option to use MDATA but only for equality). In 11g you have the option to include both columns in the Context Index, so Oracle can find them in the same index.

So in versions before 11g we made the following indexes:

CREATE INDEX book_year_ind01 ON books (year);

CREATE INDEX book_price_ind01 ON books (price);

CREATE INDEX book_content ON books (content) INDEXTYPE IS
ctxsys.context;

In 11G we can make this work in one single index:

CREATE INDEX book_content ON books(content)
INDEXTYPE IS ctxsys.context
FILTER BY year, price
SORT BY price;

So now you are done with one index. Oracle will make now a so-called SDATA section, you will find it creates an extra table for the index namely a table with at the end $S.

However, if you want to search lonely on the year, it can still be necessary to create a normal index on year. It will be more efficient then the text index.

Synchronize Oracle Text index

18 September 2010

An often asked question is how to synchronize an Oracle Text Context index (A ctxcat index is automatically synchronized) , or otherwise asked, when does an insert/update/delete on the column(s) which is indexed come available for a search. There are more options depending on the version:

  1. Manual with ctx_ddl.sync_index (8 and above)
  2. With a job calling ctx_ddl.sync_index (8 and above)
  3. Synchronize through ‘Sync ‘ (10 and above)
  4. Manual with ‘Alter index’

Manual with ctx_ddl.sync_index

This is the oldest method. You call the procedure with the name of the index and it will sync the index. There is an option for a second parameter, which is the memory available for the synchronization. If there are a lot of updates to do, it is the best to set here some value. Otherwise the default is the value found in ctx_parameters for default_index_memory.

Ctx_ddl.sync_index called in a job

The manual update cost a lot of management, when must I do it and so on. Therefore it is better to schedule the synchronization. This can be done with dbms_job or in newer versions of the Oracle database with dbms_scheduler. By using these you don’t have less to manage yourself. With this job you can self design when the jobs needs to run.

Synchronize through ‘Sync’

Since 10g it is possible by creating the index to set the characteristics of a synchronization. This can be done by setting the sync option within the parameters section. There are 3 options:

  • Manual: You still have to call ctx_ddl.sync_index
  • Every ‘interval': with this option you create a job (just as the second here above mentioned option). “interval” has the same syntax as just by the dbms_scheduler package. For this option the user needs ‘Create job’ rights.
  • on Commit: This option will at every commit sync the index. Use this option only if you don’t have too many changes on the columns of the index. The transaction will wait till the index is synchronized.

Manual with ‘Alter index’

This is the less used option, because the whole index will be rebuild. Because of the complete rebuild, it cost time and resources. An advantage is that the index also is optimised afterwards. Use this option only if you need to make changes to the index itself, as some setting of parameters, or a new tablespace. Otherwise use one of the above.

Synchronization and when

It is possible to see which indexes needs to be synchronized. There is the view CTX_PENDING. In this view are the indexes with the rowids of the records who need to be synchronized. So you can possible query this view and see which indexes needs a sync. This can of course also be scheduled.

After an optimisation an index is not optimised, meaning the index can be bigger than necessary is. But about that I will talk later.

Oracle 11g and Documentum (Part 3)

7 September 2010

So how did we go further? By using Oracle Enterprise Manager Grid Control (OEMGC), the queries which cause performance penalties, could be found quickly and so some extra indexes were made. But that wasn’t the only thing we saw during test. We also saw some queries running twice instead of running once. It happens for example by changing from one tab to another. Sometimes a query from the first tab was also run, before the second tab. For this behaviour we had to get support from EMC.

For other queries we need to check further, and came to the problem what Oracle has when FORCE is used for CURSOR_SHARING. The Oracle Optimizer doesn’t always like histograms with the setting of the parameter. After looking at the queries the histograms on the table DM_GROUP_R were the problem. So the statistics on the table were collected without the histogram and then these statistics were locked.

Use of Text indexes

30 May 2010

For what can we use these text indexes? There are many options:

  1. Large text parts (also XML)
  2. Binary documents (Word, PDF and many more)
  3. Case insensitive search
  4. Diacritic search
  5. Fuzzy search
  6. Stemming search

The above are the ones in which I choose for a text index. There are many more possibilities, such as position and score of a search.

The first two I think are the most used options for text, the other are less used, which is sadly, because option 3 and 4 are very simple to implement with Oracle text.

The third option is the one where many people choose for a function-based index (theu use upper or lower to get the case insensitive search), but Oracle Text is per default case-insensitive. The fourth option you can include in the text so you can immediately search without diacritics to find the term with a diacritic letter. This can be very useful for languages which use diacritics (German, French, Hungarian and many more). Fuzzy search is the search for words that looks similar, normally because of spelling errors. Fuzzy search is not possible for all languages supported for Oracle Text. Stemming search is the option to search for words which have the same linguistic root. So you can search cat and you also find cats. Stemming also is not possible for all languages supported for Oracle Text.

Later on I will give some examples how you can do some of the above.

Oracle Text Start

23 April 2010

So my intention was to write a lot about Oracle Text in my blog. I use it now for over 2 years in application, and every time I find new options and new opportunities in it. I think Oracle Text is an underestimated feature/option or even maybe a technology of the Oracle Database. The advantage is also, it exists in all edition of the database, from XE to Enterprise Edition. And it is included in the license. So why don’t you start to use it.

Every user created in the database can use an Oracle Text index. There are 4 types of indexes in Oracle Text:

  1. Context – used mostly for large (text) documents, it is the most powerful index
  2. Ctxcat – used for mixed queries, so small documents with also other columns of a table. It has not al the options for Context indexes
  3. Ctxrule – only used for classification or routing
  4. Ctxxpath – only used to index ExistsNode() queries on XMLTYPE columns. For 11g and higher it is recommended to look at the new index XMLINDEX (not an Oracle Text Index). Ctxxpath will be deprecated.

An Oracle Text index will almost identically created as a normal index, you use CREATE INDEX. The difference is you need to make it aware of the Oracle Text index. For a context index the simplest syntax is:

create index textindex on testtable
(testcolumn)
indextype is ctxsys.context;

So you see you need to include the last line to get Oracle aware of the use of an text index. This index will use the default settings for a text index. These defaults can be seen in the view CTXSYS.CTX_PARAMETERS. It is possible to set new defaults for all these values, these will be used for the next new to create indexes. For this change you need extra privileges: you need the CTXAPP role and additional grants to execute packages owned by CTXSYS, but I will later explain.

How can you use such a index? You can now use index in SQL, but it has it own syntax. The syntax for a context index includes the word CONTAINS, a ctxcat index includes CATSEARCH and a ctxrule index includes MATCHES. So an example for a context index is:

select * from testtable
where contains(testcolumn,’test’) > 0;

So we have now a start, later more about Text Indexes and how they can be used.

Oracle text sources

17 April 2010

The page about Oracle Text on Technet is nowadays hard to find. It is not found on the products page, it is not listed as an option in the database, but fortunately it is there, see http://www.oracle.com/technology/products/text/index.html. Is has pdf’s about the features of Oracle 11g, some white papers and a link to the sample codes. So it is still a good source to start with Oracle Text.

The other source for starting with Oracle text is on docs.oracle.com. Here you can find all the manuals for the products of Oracle based on versions. There are 2 manuals for Oracle Text in 11g:

Both give you an idea of the possibilities of Oracle Text.

Oracle 11g and Documentum (part2)

17 April 2010

By not setting the given parameters for Documentum according to their documentation, we need to be sure the statistics in Oracle should be up-to-date. Beside the normal stats on the tables and indexes, the standard night batch is ok, we need also to be sure that the system statistics are ok. This comes in two ways, with workload and no-workload. For the start no-workload is ok, later as in production, workload can be used on a busy system. So I started with the next statement, performed by SYS:

begin
  dbms_stats.gather_system_stats();
end;

After that some queries which went wrong, went better. So later I will perform the workload statement if we go to acceptance and production. Now back for searching the needed indexes, but I will be sure, our testers will also complain about performance, so it will not hard to find the missing ones.


Follow

Get every new post delivered to your Inbox.