Call for Participation

2nd International CIKM Workshop on Patent Information Retrieval - PaIR'09
November 6, 2009
Hong Kong (colocated with CIKM 2009)

Workshop homepage:

Program and Registration

The titles of the accepted contributions are available (see below).
Deadline for early-bird registration has been extended to September 3. Please see the CIKM Web site for details.


Patent Information Retrieval specialists in the 21st century face many challenges. They must search very large numbers of documents in multiple languages expressing complex technological concepts through sophisticated legal clauses. Despite a great deal of theoretical development in Information Retrieval techniques, advanced search tools for patent professionals are still in their infancy.

The objective of the workshop is to provide a forum for Information Retrieval and Knowledge Management scientists as well as Patent Retrieval experts from industry to study the next generation of patent search tools. We encourage IP professionals to present their special information needs and IR&KM researchers to present relevant technical ideas, for example for high-recall prior art searching.

We also promote exchange of ideas on measuring the progress of system performance for retrieval tasks in the intellectual property domain.



  • (tbd)
    Noriko Kando, National Institute of Informatics, JP
  • From patent information retrieval to semantic data mining
    Ilkka Havukkala, Intellectual Property Office of New Zealand, NZ

Paper Presentations

  • A Case for Probabilistic Logic for Scalable Patent Retrieval
    Iraklis A. Klampanos, Hany Azzam and Thomas Roelleke
  • Identification of Low/High Retrievable Patents using Content-Based Features
    Shariq Bashir and Andreas Rauber
  • Extracting Problem Solved Concepts from Patent Documents
    Shahzad Tiwana and Ellis Horowitz
  • Phrase-based Document Categorization revisited
    Cornelis H.A. Koster and Jean G. Beney
  • A Design Rationale Representation Model Using Patent Documents
    Ying Liu

Poster Presentations

  • Interactive Constrained Clustering for Patent Document Set
    Yusuke Sato and Makoto Iwayama
  • Automatic Translation of Scholarly Terms into Patent Terms
    Hidetsugu Nanba, Hideaki Kamaya, Toshiyuki Takezawa, Manabu Okumura, Akihiro Shinmori and Hidekazu Tanigawa
  • Using normalized alignment scores to detect incorrectly aligned segments
    Andreas Türk
  • On the Role of Classification in Patent Invalidity Searches
    Christopher Harris, Steven Foster, Robert Arens and Padmini Srinivasan
  • Patent Claim Decomposition for Improved Information Extraction
    Peter Parapatics and Michael Dittenbach
  • FindCite - Automatically Finding Prior Art Patents
    Shahzad Tiwana and Ellis Horowitz

Important Dates

  • Early-bird registration deadline: August 27, 2009
  • Workshop date: November 6, 2009



General Chair

  • John Tait, Information Retrieval Facility, AT

Workshop Organizers

  • Helmut Berger, Matrixware Information Services GmbH, AT
  • Michael Dittenbach, Matrixware Information Services GmbH, AT
  • Mihai Lupu, Information Retrieval Facility, AT

PR & Communications Chair

  • Marie-Pierre Garnier, Information Retrieval Facility, AT

Program Committee

  • Giambattista Amati, Fondazione Ugo Bordoni, IT
  • Leif Azzopardi, University of Glasgow, UK
  • Helmut Berger, Matrixware Information Services GmbH, AT
  • Bruce Croft, University of Massachusetts Amherst, US
  • Hamish Cunningham, University of Sheffield, UK
  • Barrou Diallo, European Patent Office, DE
  • Michael Dittenbach, Matrixware Information Services GmbH, AT
  • Karl Fröschl, University of Vienna, AT
  • Gerben Gieling, Synthon BV, NL
  • Cornelis H.A. Koster, University of Nijmegen, NL
  • Wessel Kraaij, TNO, NL
  • Birger Larsen, Royal School of Library and Information Science, DK
  • Mihai Lupu, Information Retrieval Facility, AT
  • R. Manmatha, University of Massachusetts Amherst, US
  • Andreas Pesenhofer, Matrixware Information Services GmbH, AT
  • Andreas Rauber, Vienna University of Technology, AT
  • Giovanna Roda, Matrixware Information Services GmbH, AT
  • Thomas Roelleke, Queen Mary University of London, UK
  • Michael Schwantner, FIZ Karlsruhe, DE
  • Wim Vanderbauwhede, University of Glasgow, UK
  • Jun Wang, University College London, UK
  • Jianhan Zhu, University College London, UK


  1. The scheduling of PaIR'09 in conflict with the world's largest patent information fair demonstrates the wide gulf between the information retrieval and patent information user communities and the distain of the former for the latter.

    1. In this particular case, the scheduling has nothing to do with disdain (if that's what you mean) of any kind; it's just bad luck. The PaIR workshop is traditionally co-located with CIKM, the Conference on Information and Knowledge Management, because of the appropriate conference topics covering an interesting mix of databases, information retrieval and data mining. So, we have no influence on the date.


      1. I apologize for using the word "distain." The two groups simply inhabit separate worlds.

      2. What do you mean by "traditional?" It's only your second workshop. What would happen if you colocated with a patent user meeting? Answering "no participation" would highlight the syndrome.

        1. Michael may want to respond as well but I think I can also shed some light on the motives here.

          There are two groups of professionals here, the information retrieval folks and the intellectual property folks.  The IP people can take advantage of what the IR people have to offer if it leads to improved searching tools that make the IP person's job easier, more productive or more insightful.  The IR people can be attracted to take on the challenge of patent information because of the difficulties of working with this particular corpus.

          An organization like the IRF needs to extend outreach to both communities, so they have made the rounds this year to IPI, PIUG and ICIC (last year) to engage the IP community more deeply.  To extend into the IR community and spread the message about how important and interesting it is to work with patent information, they have decided to co-locate a symposium with CIKM, which is a large conference focused on search and IR.  CIKM is sponsored by Google, for instance.  So in essence PaIR is more for the benefit of the IR community than it is for the IP community.  The IRF still extends the offer to the IP community to participate if they like.

          Perhaps it makes sense for the IRF to work with some of the patent information meetings to offer an IR session for IP people as a way to help cross-pollination.  They did this at the last IRF Symposium, and the 2-hour session on IR for IP was heavily attended by the IP community in attendance and was appreciated for the view it provided into the IR world.  For the record, there was a similar IP for IR session where the IR people got a look into the issues IP professionals face in searching with the existing collection of tools.  This session helped the IR people understand the challenge and expectations of the IP community.



          1. Thanks Tony for this excellent wrap-up of what this is all about. Given the background of the people involved in organizing PaIR, the workshop is admittedly more on the computer science side and therefore embedded in the typical computer science conference setting (CIKM). But with this event we try to bridge the gap by explicitly asking the IR, knowledge management, database, semantic systems, machine translation, etc. research communities to work with patent data and on problems specific to patent search. Furthermore, we invite IP people to talk about the "unpleasant" issues they are confronted with on a daily basis, which might be solvable with state-of-the-art computer science methods. In doing so, we are engaging both sides in a meaningful dialogue from which everybody can learn something and progress can be made. Besides Tony's keynote speech last year, we had a great paper and talk by Kris Atkinson from Boston Scientific that gave the computer scientists deep insight into the life of a patent searcher. There were also a lot of people from the IP community in the audience. Sherri Voebel, for example, was very positive about the workshop.

            Refocussing on the actual topic of this Confluence page, I'd like to invite the people from the PIUG community to contribute to this workshop by submitting a paper. There are still two weeks left. And in case you can't make it this year, we are looking forward to seeing you at the "legendary" Third Workshop on Patent Information Retrieval co-located with the CIKM 2010 in Toronto, Canada, Oct. 25-29.


  2. Personally, I have never heard of this event which, according to Alan's logic, means I must harbor distain for the patent information community.

    Do you know what might have been nice? To hear more about what appears to be a major patent information conference in a part of the world that few of us have had an opportunity to visit and would love to know more about.

    It would also be nice to hear more about the organization running this conference and how they are similar or dissimilar to the patent information communities in the US and Europe.

    If only there were someone who was familiar with both communities and who could act as a liaison between the two, building bridges that would benefit both communities.

    Instead we get stones thrown when an organization is making a genuine effort to try and build a bridge between two communities (IR and IP in this case).

    Scheduling under the best of circumstances is exceedingly difficult and conflicts are inevitable.  Trying to do so in conjunction with a large collection of users where our sub-class is a very small representation with little opportunity for input is almost impossible.

    So Alan, why don't you help all of us and say more about this upcoming meeting and the people behind it.



    1. 18,000 patent professionals attend over 3 days. This has been going on for decades.

      Landon IP exhibits. LexisNexis exhibits. East Linden exhibits. Thomson exhibits. WIPO exhibits. The USPTO exhibits. The EPO exhibits.

      To paraphrase one major Western exhibitor, "I wish I could bring just a few PIUGers to this show. The scale and crowd would boggle their minds."

      1. If the organizers of the world's largest patent information fair published information in a language other than Japanese, Europeans and Americans would be more aware of the fair and better able to avoid conflicts and perhaps attend. Unfortunately, the website doesn't have an English counterpart, so most of us haven't heard about it.

        Please tell us more and suggest to the organizers that they publicize the fair to people in the rest of the world.

        1. Edlyn, You are writing about a different gulf.

          The PaIR vs PIFC gulf is between a group that seeks to bring IR to patent information and a very large body of patent information users. That the organizers of PaIR were either oblivious to or chose to ignore a decades-old - I first attended PIFC in the 1980s - gathering of close to twenty thousand patent information professionals is symptomatic of the breadth of this gulf.

          The gulf you are writing about is between Asian patent information users - PIFC attracts many Korean and Chinese professionals - and a subset of Western patent information users. Vendors and patent offices have obviously bridged this gulf. At the moment, I can't think of a clear benefit to either PIFC or PIUG in bridging this at the end user level. (Thomson, LexisNexis and Landon IP are already teaching them everything PIUGers know.)

      2. Alan,

        Thank you for this brief introduction.  From the limited opportunities I have had to visit the region and from conversations I have had with practitioners I have always heard that users in Asia (especially the Japanese) are much more interested in patent analytics than their Western counterparts.  I have also seen and heard that patent information in general is more highly regarded by Asian business managers and it plays a larger role in strategic planning in that part of the world.  Having said this it is not a surprise that an event of this sort is so large and important.

        It would really be helpful if you could help us by translating some of the particulars around this conference, such as what sort of patent professionals attend (attorneys, business people, etc.) and what sort of activities they engage in.  A conference of this size would make sense if it incorporated licensing, M&A, prosecution and other activities that involve patents.  As all of us are aware, here in the US these activities are handled by several different organizations: AIPLA, IPO, LES, etc.

        So let me stress again that any information that you can provide including some contacts who would be willing to speak with a Western audience on this conference would be highly appreciated.

        1. Tony, Thank you.

          This is a large task so I have to beg off. The Japan Science & Technology Agency has hired me on the side as an internationalization coach for their technology transfer office. Class materials are due today.

          If you are planning to attend PaIR'09, stop in at PIFC on your way there.

          1. Edlyn/Alan,

            [This comment has been reproduced as a new PIUG-DF topic: Japan Patent Information Fair and Conference (PIFC 2009). Please continue the conversation about PIFC 2009 there.]

            First, for a reference:

            The 2007 and 2009 Japan Patent Information Fair and Conference (PIFC) were listed among the patent information meetings on our web site in 2007 and 2008.

            I missed 2009 PIFC in my wiki updates; Alan added information on May 29, 2009 (v. 106). Information about PaIR'09 was added by Michael Dittenbach on March 23, 2009 (v. 64). I do not know what we can do to avoid third-party event conflicts.

            Secondly, reiteration of Tony's question:

            It would be nice to hear from you or other Japan-based PIUG members what PIFC is about. In the past, I used machine translation on the Japanese-language text and got the impression that PIFC is mostly a big expo with an excellent one-day symposium, apparently in English, with talks from the EPO etc. (I have been afraid that the PIUG Meeting, given its number of commercial talks, could in future look like PIFC does now.) What else is included in the conference part of the PIFC event?

            Finally, my questions concern the intersection of the information retrieval and patent information communities in Japan.

            Japan was the only country (until 2009) that ran patent information retrieval evaluation tasks for years (NTCIR, NII Test Collection for IR Systems) (see "The development of an information retrieval system of research papers and patents for academic researchers is central to the Intellectual Property Strategic Programs for 2006, 2007, and 2008 of the Intellectual Property Strategy Headquarters in the Cabinet Office, Japan"). Could anybody brief us on whether and how NTCIR and/or NII is presented at the PIFC? What commercial products that apply advanced information retrieval science have been developed in Japan and presented at PIFC (cf. East Linden was developed



        2. An event that has attracted 18,000 attendees annually for 20+ years obviously makes sense, whatever its content.

          PIFC is a trade show with conference on the side. Most exhibitors show patent search and analysis services and tools. A few patent translation companies exhibit. Licensing, M&A and prosecution are not part of it. The language of the show is Japanese. (I have attended this off and on since 1987 and joined SiriusSoft's exhibit and seminar in 2007.)

          There is another conference for patent searchers, InfoPro, that is held each January. InfoPro is sponsored by the Japan Science & Technology Agency. (I have not attended this.) (I have a new part time position as internationalization coach in JST's Innovation Strategy Headquarters.)

          NTCIR is a copy of TREC. It is basically a herd of academics happily drinking from a spring of money. If practical services are coming out of it, they are likely going into the JPO's operations, for example, for preclassifying patent applications. NTCIR is held every 18 months; the next conference will be in June 2010. The language of the meetings is English. (I attended NTCIR last December.)

          Prosecution content is part of the Intellectual Property Association of Japan (IPAJ). Their annual meeting is held in May or June and also covers broader content such as an XML markup language for claims. Michael Phelps was a keynote speaker this year. The language of the meetings is Japanese. (I am a member of the Asian Innovation Committee whose primary objective is to harmonize patent protection for medical devices in Japan, Korea and China.)

          SMIPS, Strategic Management for Intellectual Properties, is a ragtag networking group that meets 1 pm to 8 pm on the third Saturday of each month at a local college. (The meeting time accommodates faculty, students and professionals: the professionals' ages range from 26 to 75.) With emphasis on mentoring younger professionals, meetings follow a 3-1-3 format: three concurrent group meetings followed by a keynote followed by three concurrent group meetings. The groups are a hodgepodge of special interests: licensing associates, patent law case studies, children's books on IP, agribusiness, patent claims markup, IP education, and negotiation role playing. The September meeting is devoted to career development. The language of the meetings is Japanese. (I participate in the licensing associates, IP education and claims markup groups, hosted a SMIPS representative in Paterra's booth at AUTM in March, and will keynote the September meeting.)