Alfresco discussion and collaboration. Stick around a few hours after asking a question.
Official support for Enterprise subscribers: support.alfresco.com.
Join in the conversation by getting an IRC client and connecting to #alfresco at Freenode. Or you can use the IRC web chat.
More information about the channel is in the wiki.
More help is available in this list of resources.
2019-12-06 09:07:40 GMT <hi-ko> Hi AFaust, did you also observe that ASS 1.4 is slower in search performance than solr4 by a factor of 2.5x-5x when running under the same conditions (memory, cpu, caches, disk)?
2019-12-06 09:08:50 GMT <hi-ko> strange is that cpu and disk get bored
2019-12-06 09:10:31 GMT <AFaust> Did not have a system yet where SOLR 4 was replaced with ASS for direct comparison. Most long term customers still on 5.2 with SOLR 4. Only newer customers on 6.x with ASS 1.2/1.3 mostly.
2019-12-06 09:12:24 GMT <AFaust> But in my scaling tests I can not say that there is any such drastic slow down compared to what I have observed with SOLR 4 over the years
2019-12-06 09:12:54 GMT <AFaust> But these scaling tests never ran on SOLR 4, so no hard comparison data
2019-12-06 09:13:12 GMT <hi-ko> strange thing is that the new ASS service does not consume the available cpus and memory. the system is idle on disk and cpu most of the time, but searches that took 2-5 sec in 5.2/solr4 now take ~10-20 sec
2019-12-06 09:14:21 GMT <hi-ko> I will now increase caches and in the next step reindex with shards. system has (only) ~30 million nodes
2019-12-06 09:15:16 GMT <hi-ko> I also compared openjdk8/openjdk11/oraclejdk8 with no difference
2019-12-06 09:17:12 GMT <AFaust> My scaling test ASS 1.4 + 100+m nodes runs queries in ~5s once cached. Initial query is 50-120s mostly because of facets, filter queries and sort requested (IO cost of initial index loading)
2019-12-06 09:17:44 GMT <AFaust> Not increased caches yet, but may have to in order to prewarm 100s of common filter queries
2019-12-06 09:19:00 GMT <AFaust> the 5s above was for unsharded - the sharding I mentioned previously is meant to improve that worst case initial load time through smaller indexes + parallelism
2019-12-06 09:19:17 GMT <hi-ko> I doubled jvm heap from 12 to 24 gb but it looks like the caches are only kept for a very short time
2019-12-06 09:20:37 GMT <AFaust> caches are kept until new searcher is instantiated (after certain amounts of index updates / commits), then only a fraction of entries are by default carried over / refreshed
2019-12-06 09:20:53 GMT <AFaust> see autoWarmCount config parameters
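The cache behaviour AFaust describes above (entries live until a new searcher is opened, after which only a fraction is carried over) can be pictured with a toy model. The sketch below is not Solr code: `ToyCache`, `new_searcher` and `expensive` are hypothetical names, and the only thing borrowed from the discussion is the idea of an autowarm count that limits how many of the most recently used entries are regenerated for the new searcher.

```python
from collections import OrderedDict


class ToyCache:
    """Toy LRU cache; NOT Solr code, only a model of the autowarm idea."""

    def __init__(self, size, autowarm_count):
        self.size = size
        self.autowarm_count = autowarm_count
        self.entries = OrderedDict()  # least recently used first

    def get(self, key, compute):
        """Return (value, was_hit); compute and store the value on a miss."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key], True
        value = compute(key)
        self.entries[key] = value
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used
        return value, False

    def new_searcher(self, compute):
        """Build the cache of a new searcher: only the most recently used
        `autowarm_count` keys are recomputed, everything else starts cold."""
        warmed = ToyCache(self.size, self.autowarm_count)
        if self.autowarm_count > 0:
            for key in list(self.entries)[-self.autowarm_count:]:
                warmed.entries[key] = compute(key)
        return warmed


def expensive(query):
    return query.upper()  # stand-in for running a query against the index


cache = ToyCache(size=512, autowarm_count=2)
for query in ["a", "b", "c", "d"]:
    cache.get(query, expensive)

cache = cache.new_searcher(expensive)
print(list(cache.entries))  # ['c', 'd']: only the 2 most recent entries were re-warmed
```

In this model a larger heap does not help by itself: after every searcher change, anything beyond the autowarm count has to be recomputed on first use.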
2019-12-06 09:21:04 GMT <hi-ko> storage is ssd san with > 500 MB/sec but solr only reads ~20 MB/s
2019-12-06 09:23:15 GMT <AFaust> it's usually limited by IOPS due to small, non-contiguous reads, not by bandwidth / throughput
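To make the IOPS point concrete, here is a small back-of-the-envelope calculation. The ~20 MB/s figure comes from the log; the read sizes (4, 16 and 64 KiB) are assumptions chosen for illustration, since the actual size of the index reads was not measured in this conversation.

```python
def iops_for_throughput(throughput_mb_s, read_size_kib):
    """Reads per second needed to reach a throughput with a fixed read size."""
    bytes_per_s = throughput_mb_s * 1_000_000
    return bytes_per_s / (read_size_kib * 1024)


# Observed throughput from the log: ~20 MB/s. Read sizes are hypothetical.
for read_size_kib in (4, 16, 64):
    reads = round(iops_for_throughput(20, read_size_kib))
    print(f"{read_size_kib} KiB reads -> ~{reads} reads/s")
```

If the reads really are small, 20 MB/s already corresponds to thousands of individual reads per second, so per-read latency rather than the 500 MB/sec bandwidth limit would determine how fast an uncached query finishes.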
2019-12-06 09:30:53 GMT <hi-ko> solrcore.properties are more or less default and only slightly increased: https://gist.github.com/hi-ko/10f378d28269116a83bc84ef72ac06b2
2019-12-06 09:30:54 GMT <alfbot> Title:solr snippets · GitHub (at gist.github.com)
2019-12-06 09:32:11 GMT <hi-ko> which worked quite ok with solr4. I didn't identify changes in ASS 1.3-1.4
2019-12-06 09:40:18 GMT <AFaust> did you already identify specific, low-level queries which are more affected than others? I would not expect that 2.5x-5x to be across the board, so pinpointing what specifically is affected would be a good starting point for further investigation / considerations
2019-12-06 09:41:38 GMT <hi-ko> we do just simple searches using different single search words in sequence - no complicated stuff
2019-12-06 09:42:53 GMT <hi-ko> maybe it has something to do with the translation changes in ASS
2019-12-06 09:46:56 GMT <hi-ko> solr6 index is ~10% smaller so I didn't expect that degree of performance impact
2019-12-06 09:54:57 GMT <AFaust> Actually, translation / multi-lingual handling is no different in ASS - it is only that ASS always comes with default shared.properties that overrides the defaults in Java code. SOLR 4 did not have a default shared.properties (only sample) until some Alfresco 5.2 Enterprise Service Pack
2019-12-06 09:55:33 GMT <AFaust> e.g. SOLR 4 in Alfresco 5.2.4 behaves the same as ASS with regards to translation / multi-lingual handling
2019-12-06 09:55:57 GMT <AFaust> And if the sample shared.properties were enabled in earlier SOLR 4 instances, they would behave the same as well...
2019-12-06 09:56:44 GMT <hi-ko> so you guess that's not the reason why the systems differ
2019-12-06 09:59:30 GMT <hi-ko> do you know a way to replay, in the solr admin ui, the searches alfresco sends, to measure the time spent in solr only? e.g. name=xxx or title=xxx or description=xxx
2019-12-06 10:00:58 GMT <AFaust> I typically use the Query tool in admin UI and log output from SolrQueryHTTPClient
2019-12-06 10:02:14 GMT <AFaust> Won't be able to replay 100% due to Query tool lacking support to pass various special POST body JSON elements which Alfresco FTS query parser uses, but has so far never prevented me from identifying roots of problems
2019-12-06 10:02:39 GMT <AFaust> Can of course always curl recorded HTTP POST query requests...
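As an alternative to curl, a recorded request can be replayed and timed with a few lines of Python. This is a minimal sketch under assumptions: the URL and JSON body have to be copied from the SolrQueryHTTPClient debug log, the `Content-Type` header is a guess at what the endpoint expects, and `post_json` and `time_replays` are hypothetical helper names.

```python
import json
import time
import urllib.request


def post_json(url, body, timeout=120):
    """POST a JSON body and return the raw response bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def time_replays(send, repeats=3):
    """Call send() several times; return wall-clock seconds per call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        send()
        timings.append(time.perf_counter() - start)
    return timings
```

Usage would look like `time_replays(lambda: post_json(recorded_url, recorded_body))`, where `recorded_url` and `recorded_body` are placeholders for the values taken from the log. The first timing then reflects the uncached case, and later ones show how much the caches help.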
2019-12-06 12:39:42 GMT <alfresco-discord> <monica> Hello everyone. I am trying alfresco 6.2 and it's running but I can't find any option to create a site. How do I create a new site? Please guide.
2019-12-06 12:54:59 GMT <AFaust> I assume you are using the Share interface, correct? Then you should have the option to create a site via the "Sites" menu item in the black menu at the top of the page after login...
2019-12-06 13:42:45 GMT <hi-ko> the most obvious difference I can see between the slow solr6 and the fast solr4 version is that solr4 really uses the memory in the jvm while solr6 only uses ~6.5 GB
2019-12-06 13:44:15 GMT <hi-ko> comparing the config in solr-admin-gui doesn't show any difference for me
2019-12-06 14:35:36 GMT <hi-ko> AFaust I made some tests curling simple queries captured from SolrQueryHTTPClient debug and the identical query is really taking that long. I think the main issue is that solr doesn't take the memory available: http://files.ecm4u.de/s/5xKBxL6aktmzKgH
2019-12-06 14:35:38 GMT <alfbot> Title:Nextcloud (at files.ecm4u.de)
2019-12-06 14:48:40 GMT <AFaust> Did you check the cache statistics for the targeted core? Compare them to SOLR4 in the same usage scenario?
2019-12-06 14:48:56 GMT <AFaust> Might be that the default Lucene field cache configuration has changed.
2019-12-06 14:50:00 GMT <AFaust> Field cache is an extremely low-level cache not configured via solrcore.properties - it can be configured via solrconfig.xml, but AFAIK Alfresco does not have any active config in there as well...
2019-12-06 15:17:43 GMT <hi-ko> AFaust in solr6 the fieldCache shows 99 entries having total_size 33.6 MB. the solr4 cache stats contain much more and larger entries
2019-12-06 15:18:20 GMT <hi-ko> but couldn't find any config in solrconfig.xml in solr6
2019-12-06 15:20:19 GMT <AFaust> Look for fieldValueCache
2019-12-06 15:20:36 GMT <AFaust> the name of the runtime cache does not always match 100% with the name in XML
2019-12-06 15:23:05 GMT <AFaust> If not configured explicitly, the default in SOLR6 code is size of 10000, initialSize of 10, and showItems of -1 (meaning unbounded)
2019-12-06 15:23:35 GMT <AFaust> Interestingly (looking at code) those caches also support a maxRamMB option (default: unbounded)
2019-12-06 15:23:54 GMT <hi-ko> they are all commented (default alfresco)
2019-12-06 15:24:14 GMT <AFaust> That's what I meant with "Alfresco does not have any active config"
2019-12-06 15:24:26 GMT <AFaust> So it should fall back to the SOLR defaults
2019-12-06 15:24:35 GMT <hi-ko> what I found googling are changes in lucene Fields vs. DocValues between solr4 and solr6
2019-12-06 15:24:52 GMT <AFaust> Yeah, but DocValues are mostly for faceting
2019-12-06 15:25:23 GMT <AFaust> Alfresco will mostly only create docValues fields when the model has the facetable option set to true
2019-12-06 15:30:05 GMT <hi-ko> index is ~100GB and if the result is not cached a simple search takes 7-30 sec, despite very fast storage
2019-12-06 15:31:08 GMT <hi-ko> solr never takes more than 6-7 GB in the jvm
2019-12-06 17:22:56 GMT *** mmccarthy1 is now known as mmccarthy
The other logs are at http://esplins.org/hash_alfresco