Oracle 11g and Documentum (Part 4)

20 May 2013

Long time nothing mentioned, but we got further with Oracle. The first major task was the upgrade to Oracle 11.2. It isn’t the hardest task but must be done. It is nothing more than shutting down, installing new Oracle software and do the upgrade. That’s all. We went to 11.2.0.3.

 

What brought this to us? Well, we had many more queries that went wrong in this version, but still some did. We observed that the Documentum standard queries did well, but some newly coded ones not. After looking at these, sometimes a rewrite was the best, sometimes a new index must be made. At the end we had only 2 left, we solved it by using baselines in Oracle.

 

One thing to mention, is that all above was done in an testing environment. When we go live, maybe some other queries goes mad, and we still need some tuning. But that’s for later.

 

Google style search

14 July 2012

I wanted to let you know that on the Oracle Search text Roger Ford is developing a Google-style search package to let you allow to implement a Google style search in your application. Is is stille beta but give it try if you need such a search.

See: https://blogs.oracle.com/searchtech/entry/oracle_text_query_parser

New features Oracle Text 11g part 1: Mixed queries

8 December 2010

Oracle 11g gave Oracle Text a boost. Oracle added many new features. In the upcoming posts I will give the new features of Oracle Text 11g.

The first one I want to show is the feature called “Mixed Queries”. The feature is added to the Context Index, it was already an option for Cxtcat Index. By using “Mixed Queries”, you include in the text index not only the column on which you want to search but also on extra fields, on which you normally would put an index and on which you want an additional search.

Example:

select title from books
where contains(content,'oracle') > 0
and   published_year = 2006
and   price < 40
order by price asc

The above query searches for book containing the word oracle in the content, published in the year 2006 and it price is below €40 (yes. I’m in Europe), ordered by the price. In Oracle 10g and before you could index the column content with a Context index, but nor the other 2 columns (Ok, in 10g you have the option to use MDATA but only for equality). In 11g you have the option to include both columns in the Context Index, so Oracle can find them in the same index.

So in versions before 11g we made the following indexes:

CREATE INDEX book_year_ind01 ON books (year);

CREATE INDEX book_price_ind01 ON books (price);

CREATE INDEX book_content ON books (content) INDEXTYPE IS
ctxsys.context;

In 11G we can make this work in one single index:

CREATE INDEX book_content ON books(content)
INDEXTYPE IS ctxsys.context
FILTER BY year, price
SORT BY price;

So now you are done with one index. Oracle will make now a so-called SDATA section, you will find it creates an extra table for the index namely a table with at the end $S.

However, if you want to search lonely on the year, it can still be necessary to create a normal index on year. It will be more efficient then the text index.

Synchronize Oracle Text index

18 September 2010

An often asked question is how to synchronize an Oracle Text Context index (A ctxcat index is automatically synchronized) , or otherwise asked, when does an insert/update/delete on the column(s) which is indexed come available for a search. There are more options depending on the version:

  1. Manual with ctx_ddl.sync_index (8 and above)
  2. With a job calling ctx_ddl.sync_index (8 and above)
  3. Synchronize through ‘Sync ‘ (10 and above)
  4. Manual with ‘Alter index’

Manual with ctx_ddl.sync_index

This is the oldest method. You call the procedure with the name of the index and it will sync the index. There is an option for a second parameter, which is the memory available for the synchronization. If there are a lot of updates to do, it is the best to set here some value. Otherwise the default is the value found in ctx_parameters for default_index_memory.

Ctx_ddl.sync_index called in a job

The manual update cost a lot of management, when must I do it and so on. Therefore it is better to schedule the synchronization. This can be done with dbms_job or in newer versions of the Oracle database with dbms_scheduler. By using these you don’t have less to manage yourself. With this job you can self design when the jobs needs to run.

Synchronize through ‘Sync’

Since 10g it is possible by creating the index to set the characteristics of a synchronization. This can be done by setting the sync option within the parameters section. There are 3 options:

  • Manual: You still have to call ctx_ddl.sync_index
  • Every ‘interval’: with this option you create a job (just as the second here above mentioned option). “interval” has the same syntax as just by the dbms_scheduler package. For this option the user needs ‘Create job’ rights.
  • on Commit: This option will at every commit sync the index. Use this option only if you don’t have too many changes on the columns of the index. The transaction will wait till the index is synchronized.

Manual with ‘Alter index’

This is the less used option, because the whole index will be rebuild. Because of the complete rebuild, it cost time and resources. An advantage is that the index also is optimised afterwards. Use this option only if you need to make changes to the index itself, as some setting of parameters, or a new tablespace. Otherwise use one of the above.

Synchronization and when

It is possible to see which indexes needs to be synchronized. There is the view CTX_PENDING. In this view are the indexes with the rowids of the records who need to be synchronized. So you can possible query this view and see which indexes needs a sync. This can of course also be scheduled.

After an optimisation an index is not optimised, meaning the index can be bigger than necessary is. But about that I will talk later.

Oracle 11g and Documentum (Part 3)

7 September 2010

So how did we go further? By using Oracle Enterprise Manager Grid Control (OEMGC), the queries which cause performance penalties, could be found quickly and so some extra indexes were made. But that wasn’t the only thing we saw during test. We also saw some queries running twice instead of running once. It happens for example by changing from one tab to another. Sometimes a query from the first tab was also run, before the second tab. For this behaviour we had to get support from EMC.

For other queries we need to check further, and came to the problem what Oracle has when FORCE is used for CURSOR_SHARING. The Oracle Optimizer doesn’t always like histograms with the setting of the parameter. After looking at the queries the histograms on the table DM_GROUP_R were the problem. So the statistics on the table were collected without the histogram and then these statistics were locked.

Use of Text indexes

30 May 2010

For what can we use these text indexes? There are many options:

  1. Large text parts (also XML)
  2. Binary documents (Word, PDF and many more)
  3. Case insensitive search
  4. Diacritic search
  5. Fuzzy search
  6. Stemming search

The above are the ones in which I choose for a text index. There are many more possibilities, such as position and score of a search.

The first two I think are the most used options for text, the other are less used, which is sadly, because option 3 and 4 are very simple to implement with Oracle text.

The third option is the one where many people choose for a function-based index (theu use upper or lower to get the case insensitive search), but Oracle Text is per default case-insensitive. The fourth option you can include in the text so you can immediately search without diacritics to find the term with a diacritic letter. This can be very useful for languages which use diacritics (German, French, Hungarian and many more). Fuzzy search is the search for words that looks similar, normally because of spelling errors. Fuzzy search is not possible for all languages supported for Oracle Text. Stemming search is the option to search for words which have the same linguistic root. So you can search cat and you also find cats. Stemming also is not possible for all languages supported for Oracle Text.

Later on I will give some examples how you can do some of the above.

Oracle Text Start

23 April 2010

So my intention was to write a lot about Oracle Text in my blog. I use it now for over 2 years in application, and every time I find new options and new opportunities in it. I think Oracle Text is an underestimated feature/option or even maybe a technology of the Oracle Database. The advantage is also, it exists in all edition of the database, from XE to Enterprise Edition. And it is included in the license. So why don’t you start to use it.

Every user created in the database can use an Oracle Text index. There are 4 types of indexes in Oracle Text:

  1. Context – used mostly for large (text) documents, it is the most powerful index
  2. Ctxcat – used for mixed queries, so small documents with also other columns of a table. It has not al the options for Context indexes
  3. Ctxrule – only used for classification or routing
  4. Ctxxpath – only used to index ExistsNode() queries on XMLTYPE columns. For 11g and higher it is recommended to look at the new index XMLINDEX (not an Oracle Text Index). Ctxxpath will be deprecated.

An Oracle Text index will almost identically created as a normal index, you use CREATE INDEX. The difference is you need to make it aware of the Oracle Text index. For a context index the simplest syntax is:

create index textindex on testtable
(testcolumn)
indextype is ctxsys.context;

So you see you need to include the last line to get Oracle aware of the use of an text index. This index will use the default settings for a text index. These defaults can be seen in the view CTXSYS.CTX_PARAMETERS. It is possible to set new defaults for all these values, these will be used for the next new to create indexes. For this change you need extra privileges: you need the CTXAPP role and additional grants to execute packages owned by CTXSYS, but I will later explain.

How can you use such a index? You can now use index in SQL, but it has it own syntax. The syntax for a context index includes the word CONTAINS, a ctxcat index includes CATSEARCH and a ctxrule index includes MATCHES. So an example for a context index is:

select * from testtable
where contains(testcolumn,’test’) > 0;

So we have now a start, later more about Text Indexes and how they can be used.

Oracle text sources

17 April 2010

The page about Oracle Text on Technet is nowadays hard to find. It is not found on the products page, it is not listed as an option in the database, but fortunately it is there, see http://www.oracle.com/technology/products/text/index.html. Is has pdf’s about the features of Oracle 11g, some white papers and a link to the sample codes. So it is still a good source to start with Oracle Text.

The other source for starting with Oracle text is on docs.oracle.com. Here you can find all the manuals for the products of Oracle based on versions. There are 2 manuals for Oracle Text in 11g:

Both give you an idea of the possibilities of Oracle Text.

Oracle 11g and Documentum (part2)

17 April 2010

By not setting the given parameters for Documentum according to their documentation, we need to be sure the statistics in Oracle should be up-to-date. Beside the normal stats on the tables and indexes, the standard night batch is ok, we need also to be sure that the system statistics are ok. This comes in two ways, with workload and no-workload. For the start no-workload is ok, later as in production, workload can be used on a busy system. So I started with the next statement, performed by SYS:

begin
  dbms_stats.gather_system_stats();
end;

After that some queries which went wrong, went better. So later I will perform the workload statement if we go to acceptance and production. Now back for searching the needed indexes, but I will be sure, our testers will also complain about performance, so it will not hard to find the missing ones.

Oracle 11g and Documentum

16 April 2010

In my current project my task is to migrate data from Oracle to a new Documentum environment. Beside they also asked me to keep baby sitting on the Oracle database lying under the Content Server of Documentum. The install was done by following the installation guide from Documentum. That is not that much for Oracle, just install and set some parameters. After that run some scripts and Oracle is ready to receive Documentum.

I looked at the database and saw those parameters:

  1. optimizer_index_cost_adj was 5
  2. optimizer_index_caching was 95
  3. optimizer_mode was choose

After some investigation I found out that these setting are common for a Documentum install in Oracle. It has been the best practice for more than 10 years and Documentum works this way. I made the next adjustments to the database, because the above settings come from before oracle 10g, which will be the standards for the next installation for Oracle for Documentum.

  1. optimizer_index_cost_adj back to 100
  2. optimizer_index_caching back to 0
  3. optimizer_mode to ALL_ROWS

Furthermore I set CURSOR_SHARING to FORCE. There is no way in the translation from DQL (The SQL language of Documentum) to SQL in which Documentum uses Bind Variables. So they made the variables hard-coded. So I made the decision to set CURSOR_SHARING. I tested also Similar, but then some queries went mad, mostly inserts and updates. So the safest setting is FORCE. Similar will also be desupported by Oracle in 11.2.

By default installation Documentum makes 2 tablespaces, one for data and one for indexes. But the default gives these a max size of 2GB. So after the first tests in the test environment, the tablespace of data became full. This is mostly due to the heavy auditing. This auditing happens in 1 table, DM_AUDITTRAIL_S. I made some adjustments by making another tablespace and put the table and indexes into it. This way the table can be easily monitored, by monitoring the tablespace.

The last thing what I now need to do, is monitoring the database for peaks. This is because Documentum can make new objects (which will lead to new tables), but will not make the indexes. So it is to the developer or the DBA to find these. They put me on this task. I keep monitoring and use the OEM for this. A look at the performance tab for a peak and also some jobs to find simple FULL TABLE scans. So some work to do.


Follow

Get every new post delivered to your Inbox.