Difference between revisions of "ESciDoc Developer Workshop 14 15 07 2011"
Revision as of 11:44, 14 July 2011
Developer Workshop
- Date: July 14/15, 2011
- Place: Munich
- Previous workshop 22-23.09.2010, Karlsruhe
Participants MPDL
Participants FIZ
- Steffen Wagner
- Michael Hoppe
- Christian Herlambang
- Matthias Razum
Agenda 14.07.2011
Fulltext indexing
- enhanced with own XSLT - questions from MPDL related to
- configuration of the search results output (rather than complete item/component/container)
- highlighting of search results (e.g. get the last page break tag)
- full text indexing for all FT visibility, with searching and snippet display according to privileges
- selective indexing from Admin tools
- incremental indexing
- Solr support and interfaces
Outcome on Fulltext indexing
- the current approach is not bad: one document is generated with many file-highlight fields. It still has to be checked whether the limit on highlighted fields is 100, and what the performance implications are
  - performance issues could also be caused by highlighting during indexing itself, but stem mostly from search performance
  - more input from FIZ after analysis, as the problem is clear
- Future development plans - short term roadmap on versions/features for release 1.4 to 2.0
- critical: internally managed vs. externally managed datastreams of MD records
- critical: namespace preservation bug in MD record
- Admin tool 1.3 offers only repository information
- Digilib integration
- plans, ideas, replacements?
- Scalability & Performance
- creation of items (MPDL provides some numbers)
- reindexing
- statistics - other store - faster and not dependent on escidoc-core and fedora?
- stress testing, mass data generation, monitoring of core service - how this is done internally at FIZ, with reference to the FIZ Fedora Performance and Scalability Wiki
- JBoss, other AS, Tomcat
- supporting newer versions of JBoss
- support for other AS
- Tomcat
- LTA Long term archiving
- workflow settings (content model, context?)
- items immediately released (with proper indexing afterwards) -> one call to the service
- item event log (update, insert comments?)
- Content Models
- Any plans from MPDL side?
- Pragmatic and iterative approach - some ideas
- SPO