Daily Log for #alfresco IRC Channel

Alfresco discussion and collaboration. Stick around a few hours after asking a question.

Official support for Enterprise subscribers: support.alfresco.com.

Joining the Channel:

Join in the conversation by getting an IRC client and connecting to #alfresco at Freenode. Or you can use the IRC web chat.

More information about the channel is in the wiki.

Getting Help:

More help is available in this list of resources.

Daily Log for #alfresco

2019-10-30 07:47:00 GMT <alfresco-discord> <dgradecak> morning, what are the possibilities for automatically deploying an application in Activiti 5 (APS 1.9)? app = processes + forms

2019-10-30 10:16:50 GMT <alfresco-discord> <yreg> develop all of that logic on your own

2019-10-30 10:17:10 GMT <alfresco-discord> <yreg> possibly getting some inspiration from alfresco's own demo app

2019-10-30 10:17:29 GMT <alfresco-discord> <yreg> unfortunately that's one of the many shortcomings of APS

2019-10-30 10:17:57 GMT <alfresco-discord> <yreg> the same is also true for email templates, repos, and a lot of other config possibilities
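For readers landing here with the same question: APS 1.9 has no out-of-the-box auto-deploy, but the common case can usually be covered with a small client against the app-definition REST API. The sketch below is a minimal, unverified example; the endpoint path (/activiti-app/api/enterprise/app-definitions/import), the multipart field name "file" and the credentials are assumptions to be checked against the REST API explorer of your own APS installation.

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/**
 * Minimal sketch: push an exported APS app zip (processes + forms) to an
 * app-definition import endpoint. Endpoint path, field name and credentials
 * are assumptions, not verified against APS 1.9 documentation.
 */
public class ApsAppDeployer {

    public static void main(String[] args) throws IOException, InterruptedException {
        Path appZip = Path.of(args[0]);                        // exported app zip
        String baseUrl = "http://localhost:8080/activiti-app"; // adjust to your environment
        String boundary = "----aps-app-upload";

        // Build a simple multipart/form-data body with a single "file" part.
        byte[] fileBytes = Files.readAllBytes(appZip);
        byte[] head = ("--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + appZip.getFileName() + "\"\r\n"
                + "Content-Type: application/zip\r\n\r\n").getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[head.length + fileBytes.length + tail.length];
        System.arraycopy(head, 0, body, 0, head.length);
        System.arraycopy(fileBytes, 0, body, head.length, fileBytes.length);
        System.arraycopy(tail, 0, body, head.length + fileBytes.length, tail.length);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/enterprise/app-definitions/import"))
                .header("Authorization", "Basic "
                        + Base64.getEncoder().encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8)))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}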

2019-10-30 10:33:48 GMT <alfresco-discord> <yreg> @agouveia I have been looking at https://github.com/mintel/alfresco-gcs-connector and it seems really cool and very promising

2019-10-30 10:33:50 GMT <alfbot> Title:GitHub - mintel/alfresco-gcs-connector: Google cloud storage connector for Alfresco (at github.com)

2019-10-30 10:34:01 GMT <alfresco-discord> <yreg> thanks for sharing that out there with the community !!

2019-10-30 10:35:21 GMT <alfresco-discord> <yreg> the only remark I have is that commit messages have references to invisible/inaccessible issues, and I was wondering if it wouldn't be better to use github issues instead for increased transparency 🙂

2019-10-30 10:37:10 GMT <alfresco-discord> <yreg> ah I also noticed the absence of any OS license there, and I am afraid many organisations would probably want to have that specified clearly before considering any involvement..

2019-10-30 11:36:30 GMT <alfresco-discord> <agouveia> @yreg thanks! The license is now there, was an oversight on my part. The commits were related to previous issues before deciding on going public, going forward we plan to use just github issues

2019-10-30 11:37:35 GMT <alfresco-discord> <yreg> Awesome!

2019-10-30 11:37:57 GMT <alfresco-discord> <yreg> Again, thanks for sharing this with the community!

2019-10-30 11:46:18 GMT <alfresco-discord> <dgradecak> @yreg I thought that such features would be available in "enterprise" 😉 well, developing it on your own is indeed the only solution

2019-10-30 12:01:15 GMT <alfresco-discord> <yreg> @dgradecak you would be amazed... I had to do an extension in order to be able to send customised notifications (with placeholders for execution variables), which is pretty basic IMHO and should be provided OOTB...
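As an illustration of the kind of extension yreg describes (not his actual code), here is a minimal sketch using the public Activiti 5 delegate API that resolves ${...} placeholders in a notification template from execution variables; the template text and the final notification channel are stand-ins.

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.JavaDelegate;

/**
 * Sketch of a service-task delegate that fills ${...} placeholders in a
 * notification template from execution variables before handing the text
 * to whatever mail/notification service is actually used (not shown here).
 */
public class PlaceholderNotificationDelegate implements JavaDelegate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    @Override
    public void execute(DelegateExecution execution) {
        // The template could equally come from a process variable or an extension element.
        String template = "Dear ${initiator}, your request ${requestId} was approved.";

        Map<String, Object> variables = execution.getVariables();
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            Object value = variables.get(matcher.group(1));
            matcher.appendReplacement(resolved,
                    Matcher.quoteReplacement(value != null ? value.toString() : ""));
        }
        matcher.appendTail(resolved);

        // Hand off to an actual notification channel here (e-mail, chat, ...).
        System.out.println("Would send notification: " + resolved);
    }
}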

2019-10-30 12:08:07 GMT <alfresco-discord> <dgradecak> well, I would not even discuss it if it was "org.activiti", but all these things are "com.activiti" and nothing is in there 🙂 even the connectors for ACS are so rudimentary

2019-10-30 12:09:23 GMT <alfresco-discord> <dgradecak> when using Flowable, for instance, I always wondered whether I was missing important things like the Alfresco connectors from APS, but when I see what APS offers ...

2019-10-30 12:42:08 GMT <AFaust> @agouveia First of all, great to see alfresco-gcs-connector shared. Looking over it I do have a few concerns.

2019-10-30 12:42:38 GMT <AFaust> I don't yet know if those concerns could be addressed with simple PRs without changing too much in the current config (the addon conflicts with Alfresco's subsystem approach for content stores). If they can, I will file them in the project.

2019-10-30 12:44:17 GMT <AFaust> One thing I am interested in is adding support for additional stores in https://github.com/Acosix/alfresco-simple-content-stores - and ideally I would like to be able to do so by linking to (optional) third party addons like yours, instead of re-implementing the same feature in my addon due to incompatibilities.

2019-10-30 12:44:18 GMT <alfbot> Title:GitHub - Acosix/alfresco-simple-content-stores: Addon to provide a set of common content store implementations and easy-to-use configuration (no Spring config) (at github.com)

2019-10-30 12:45:29 GMT <AFaust> Putting the project on my list of "things to check out in more detail ASAP" for the moment...

2019-10-30 12:49:02 GMT <alfresco-discord> <yreg> wondering how many items are hanging on that list, and on other lists, accumulating year after year

2019-10-30 12:50:16 GMT <alfresco-discord> <yreg> but having checked the simple stores implementation a while ago, and having peeked at this implementation now, I see the potential for enhancements...

2019-10-30 12:50:52 GMT <alfresco-discord> <yreg> and also huge potential for certain usecases

2019-10-30 12:53:28 GMT <alfresco-discord> <yreg> thinking cross cloud vendor content replication amongst other things

2019-10-30 12:53:28 GMT <AFaust> Since the list is more like a "on the back of my mind" one, the real number is hard to tell. As one of our ministers for the interior once said (when refusing to answer a question): "The answer to that question may raise your level of concern"

2019-10-30 12:53:51 GMT <AFaust> I generally joke that the number of items has four digits

2019-10-30 13:10:46 GMT <AFaust> angelborroy: Quick follow-up to my bulk indexing complaints from last week and the week before: it took ASS 1.1 ~8.5 days for the 100 million nodes (metadata-only). That is about half of what ASS 1.3 was showing me as the estimated time remaining, and only ~2-3x of what ASS 1.1 showed as its initial estimation (after a few hours of work)

2019-10-30 13:11:50 GMT <angelborroy> Thanks for the feedback AFaust

2019-10-30 13:12:17 GMT <angelborroy> We are spiking that process to increase the speed, but nothing to share for now

2019-10-30 13:12:24 GMT <angelborroy> Any hints from your experience?

2019-10-30 13:12:56 GMT <alfresco-discord> <yreg> AFaust, was that about a full re-index or the indexing of a huge transaction (a change of name on a folder high in the hierarchy, for instance)?

2019-10-30 13:13:05 GMT <AFaust> I have restarted the ASS 1.3 indexing (from scratch) with parameters identical to ASS 1.1 to check if (after 1 day of work) its progress / estimation is similar to that of 1.1 - maybe there was something wonky in my various config optimisation attempts...

2019-10-30 13:13:47 GMT <AFaust> yreg: It was a full initial index with transactions of max 51 nodes

2019-10-30 13:14:23 GMT <AFaust> No cascading operations, no ACLs (apart from default in an empty Alfresco system) - just plain bulk indexing

2019-10-30 13:14:25 GMT <alfresco-discord> <yreg> ouch, that's really slow! any idea on the number of ACLs ?

2019-10-30 13:15:05 GMT <alfresco-discord> <yreg> ok, then indeed you can rule out the usual suspects ...

2019-10-30 13:15:25 GMT <alfresco-discord> <yreg> have you tried to trace which operations are taking the longest ?

2019-10-30 13:15:48 GMT <angelborroy> There are some points we are investigating now:

2019-10-30 13:16:10 GMT <angelborroy> 1 - CascadeTracker is taking a lot of time (we are considering removing this tracker)

2019-10-30 13:16:42 GMT <alfresco-discord> <yreg> I know you can easily tweak/extend the alfresco-zipkin-module to support service-level tracing (how much time this or that service invocation took), have you considered such a thing ?

2019-10-30 13:16:49 GMT <angelborroy> 2 - CommitTracker is taking a lot of time when segments must be merged (we are considering performing commits more often)

2019-10-30 13:17:25 GMT <angelborroy> 3 - ACLTracker and ModelTracker seem to work fine in terms of performance

2019-10-30 13:17:48 GMT <AFaust> yreg: As far as my investigations went last week and the week before, most of the time is "wasted" waiting / being blocked in index merges + commits, especially since the aggressive-defensive code of Alfresco trackers will cause a huge load of "if an index document for that node already exists, delete it" queries, which are extremely expensive (need to re-calculate weights + uninvert fields on every commit to handle these)

2019-10-30 13:19:25 GMT <angelborroy> Main problem is that you cannot update a SOLR document

2019-10-30 13:19:30 GMT <angelborroy> So we can do nothing about that

2019-10-30 13:19:33 GMT <AFaust> angelborroy: It is not an issue of committing more often or less often.... in the current code, if you commit too often, concurrent threads are never saturated with sufficient work to do before the "wait for commit" threshold is reached. If you commit less often, threads are blocked waiting for the expensive merge / commit.

2019-10-30 13:20:20 GMT <AFaust> You seriously need to change the index document handling to make deletion of (potential!) old index documents way more efficient.

2019-10-30 13:21:04 GMT <AFaust> And ideally offer a mode / config option for these bulk index use cases where you can say "don't try to delete old index documents".

2019-10-30 13:21:24 GMT <alfresco-discord> <yreg> Can it not be decoupled to the background, or done in bulk ?

2019-10-30 13:21:53 GMT <alfresco-discord> <yreg> and then have some de-duplication strategy when querying...

2019-10-30 13:22:21 GMT <alfresco-discord> <yreg> oh yeah that can result in false positives ...

2019-10-30 13:22:37 GMT <AFaust> ^^ or something like that, yes. The main point: Avoid having to run "delete queries" as part of the index commit / merge as best as possible.

2019-10-30 13:23:48 GMT <AFaust> I mean, bulk indexing is a special use case, so I could very well live with a mode I can trigger / configure with special parameters, e.g. a flag for "don't delete old index entries" and "index only until txn ID XX"

2019-10-30 13:24:14 GMT <AFaust> And after that, I switch to the default mode to catch up with the rest of the changes and have normal "day-to-day" tracking.

2019-10-30 13:24:49 GMT <AFaust> That would help everyone on every / any SOLR updates too, when re-indexes are required / recommended.

2019-10-30 13:25:56 GMT <alfresco-discord> <yreg> makes sense

2019-10-30 13:26:18 GMT <AFaust> angelborroy: for reference, this class (https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/DeleteByQueryWrapper.java) almost always came up when doing a jstack during those long wait times / synchronization points

2019-10-30 13:26:20 GMT <alfbot> Title:lucene-solr/DeleteByQueryWrapper.java at master · apache/lucene-solr · GitHub (at github.com)

2019-10-30 13:26:41 GMT <alfresco-discord> <yreg> having quickly googled Solr document updates, it seems that Solr has foreseen ways to update a document in the index without having to recreate it, no? https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html

2019-10-30 13:26:42 GMT <alfbot> Title:Updating Parts of Documents | Apache Solr Reference Guide 6.6 (at lucene.apache.org)

2019-10-30 13:27:08 GMT <angelborroy> Good hint, I'll also take a look at that

2019-10-30 13:28:52 GMT <angelborroy> We cannot apply that, yreg

2019-10-30 13:29:10 GMT <angelborroy> Our SOLR fields do not meet the requirements to do that
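For context on that requirement: Solr atomic (partial) updates only work when every field in the schema is stored or has docValues, and copyField destinations are not stored - conditions the ASS schema does not satisfy, as angelborroy notes. Purely to illustrate what the mechanism looks like, here is a minimal SolrJ sketch; the core name and field names are made up.

import java.util.Collections;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

/**
 * Minimal SolrJ sketch of an atomic update: only the listed fields change,
 * the rest of the document is rebuilt by Solr from stored/docValues fields.
 */
public class AtomicUpdateExample {

    public static void main(String[] args) throws Exception {
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/alfresco").build()) {
            SolrInputDocument partial = new SolrInputDocument();
            partial.addField("id", "some-doc-id");
            // "set" replaces the field value; "add" and "inc" are the other modifiers.
            partial.addField("cm_title", Collections.singletonMap("set", "New title"));
            solr.add(partial);
            solr.commit();
        }
    }
}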

2019-10-30 13:29:37 GMT <AFaust> ... and https://github.com/Alfresco/SearchServices/blob/master/search-services/alfresco-search/src/main/java/org/alfresco/solr/SolrInformationServer.java#L3707 is one of the ways that (indirectly) triggers those "delete by query" things

2019-10-30 13:29:38 GMT <alfbot> Title:SearchServices/SolrInformationServer.java at master · Alfresco/SearchServices · GitHub (at github.com)

2019-10-30 13:29:56 GMT <AFaust> ... called from metadata tracker for every indexed node

2019-10-30 13:30:50 GMT <AFaust> So, 100 million nodes => 200 million such delete queries (one for the main index document, and one for any ERROR index document placeholder)
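To make the proposal above concrete, here is a purely hypothetical sketch of where a "bulk initial index" switch could short-circuit the per-node delete-by-query calls. This is not the actual SolrInformationServer code; the flag name and query strings are invented for illustration.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

/**
 * Hypothetical sketch of the bulk-reindex switch discussed above: skip the
 * per-node "delete any existing index document" queries when the index is
 * known to be empty (initial bulk indexing).
 */
public class IndexNodeSketch {

    private final SolrClient solr;
    private final boolean bulkInitialIndex; // hypothetical flag, e.g. set only for initial bulk runs

    public IndexNodeSketch(SolrClient solr, boolean bulkInitialIndex) {
        this.solr = solr;
        this.bulkInitialIndex = bulkInitialIndex;
    }

    public void indexNode(long dbId, SolrInputDocument doc) throws Exception {
        if (!bulkInitialIndex) {
            // Day-to-day tracking: remove any stale main/ERROR documents first.
            // These two deleteByQuery calls are the ones dominating the bulk runs described above.
            solr.deleteByQuery("DBID:" + dbId);
            solr.deleteByQuery("id:ERROR-" + dbId);
        }
        solr.add(doc);
    }
}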

2019-10-30 13:33:48 GMT <AFaust> *cough* - just checked the current ASS 1.3 estimate, and it shows "1 Year". That's the most pessimistic estimate I have seen so far. I hope that is just a temporary fluke after only 30 minutes of indexing.

2019-10-30 13:35:07 GMT <alfresco-discord> <yreg> I have seen worse, going up to almost 3 years 😄 and then it ended up taking "only" 3 to 4 days ...

2019-10-30 13:36:31 GMT <AFaust> Ok - I have never seen more than 3 months outside of real issues with the setup / infrastructure. Maybe I was lucky until now then...

2019-10-30 13:40:28 GMT <alfresco-discord> <yreg> TBH I have only seen that on one system, hosted on Windows, using MSSQL as the DB, and having a large number of affected documents due to bulk imports (migration) straight from the system's first days in PROD ... and I think full text indexing wasn't off at the time 😄

2019-10-30 13:41:13 GMT <alfresco-discord> <yreg> so, as you might have guessed, there were a lot of potential factors inflating that estimate, along with the pessimism of the estimation algorithm

End of Daily Log

The other logs are at http://esplins.org/hash_alfresco