Daily Log for #alfresco IRC Channel

Alfresco discussion and collaboration. Stick around a few hours after asking a question.

Official support for Enterprise subscribers: support.alfresco.com.

Joining the Channel

Join in the conversation by getting an IRC client and connecting to #alfresco at Freenode. Or you can use the IRC web chat.

More information about the channel is in the wiki.

Getting Help

More help is available in this list of resources.

Daily Log for #alfresco

2020-02-10 10:13:36 GMT <alfresco-discord> <EddieMay> @MorganP Can you send me your details/email that you used for University (eddie.may@alfresco.com) - it looks like access has expired rather than being anything to do w/ Alfresco ID.

2020-02-10 10:52:43 GMT <alfresco-discord> <EddieMay> > @EddieMay no "early bird tickets" this time? @dgradecak These are the early bird ticket prices

2020-02-10 12:33:51 GMT <hi-ko> Hi, I have a solr question: is there an easy way to report the biggest index nodes in the index? The index size grew dramatically and I would like to find out which docs are filling up the index space ...

2020-02-10 12:39:51 GMT <angelborroy> You can try something like this:

2020-02-10 12:39:52 GMT <angelborroy>*&sort=cm:content.size%20desc&wt=json

2020-02-10 12:40:50 GMT <AFaust> I think he means which index document consumes the most space, not necessarily which node has the largest content size (which may not be fully indexed)

2020-02-10 12:41:20 GMT <AFaust> e.g. a 6 GiB image will have a large cm:content.size but not consume much space as an indexed node if it does not have much metadata etc.

2020-02-10 12:42:17 GMT <alfresco-discord> <yreg> but honestly, usually there is somewhat of a correlation between both ...

2020-02-10 12:42:33 GMT <AFaust> well ... for plain text, that may be true

2020-02-10 12:42:52 GMT <AFaust> but as soon as you include compressed / binary formats in there, not so much

2020-02-10 12:42:59 GMT <alfresco-discord> <yreg> For text/office/pdf documents in general

2020-02-10 12:43:17 GMT <AFaust> anyway... I would simply look for the biggest *.tgz files in the SOLR content cache...

2020-02-10 12:43:46 GMT <AFaust> since these contain both indexable content and metadata, they should be good indicators

2020-02-10 12:43:47 GMT <hi-ko> AFaust: exactly ;-)

2020-02-10 12:44:23 GMT <hi-ko> AFaust: nice idea with the db files - I'll check ...
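
AFaust's suggestion of looking for the biggest *.tgz files in the SOLR content cache could be sketched roughly as below. This is only a minimal illustration, not a tool from the channel: the function name largest_files and its parameters are hypothetical, and it assumes the content cache is a plain directory tree readable by the user running the script.

```python
import heapq
import os
from typing import List, Tuple


def largest_files(root: str, suffix: str = ".tgz", top_n: int = 20) -> List[Tuple[int, str]]:
    """Return up to top_n (size_in_bytes, path) pairs under root, biggest first.

    Hypothetical helper: only files whose name ends with `suffix` are counted.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable (e.g. SOLR rewrote it); skip it.
                continue
    # nlargest returns the entries sorted from largest to smallest.
    return heapq.nlargest(top_n, sizes)
```

Calling largest_files("solr/content/_DEFAULT_/db") and printing each (size, path) pair would list the candidates; the path is the one hi-ko mentions later in this log and will differ per installation.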

2020-02-10 12:44:42 GMT <alfresco-discord> <yreg> true

2020-02-10 12:44:43 GMT <angelborroy> Be aware… we are removing that in SS 1.5 ;-)

2020-02-10 12:45:10 GMT <AFaust> Well, MS Office formats are also compressed as ZIP files, so somewhat distorted. And they can contain a lot of indexing irrelevant format directives, metadata, links etc.

2020-02-10 12:45:35 GMT <AFaust> angelborroy: When talking to Germans, it is better to use ASS, instead of SS... just saying

2020-02-10 12:45:59 GMT <alfresco-discord> <yreg> > angelborroy: When talking to Germans, it is better to use ASS, instead of SS... just saying @AFaust#0000 🤣

2020-02-10 12:46:35 GMT <hi-ko> angelborroy: and what about English speaking guys and ass? ;-)

2020-02-10 12:46:57 GMT <AFaust> Kristen once told me that internally there was a directive to use ASMS...

2020-02-10 12:47:06 GMT <AFaust> Alfresco Search Management Services, or something like this

2020-02-10 12:47:08 GMT <angelborroy> Internally we have been banned from using “ASS”

2020-02-10 12:47:22 GMT <angelborroy> We are using now SS and SIE

2020-02-10 12:49:07 GMT <AFaust> But I would say, even for English speakers, ASS is preferable to SS... the former may just be slightly naughty, but the latter was classified as a criminal organisation in one of the first allied occupation authority proclamations...

2020-02-10 12:49:59 GMT <AFaust> angelborroy: Oh, SIE... how formal.

2020-02-10 12:52:30 GMT * AFaust is considering using TSFKAASS to be completely independent of the name Alfresco is currently using internally, and not have to change it every other month

2020-02-10 12:57:32 GMT <alfresco-discord> <yreg> The Search F** Known Also as ASS ?

2020-02-10 13:13:53 GMT <AFaust> The service formerly known as Alfresco Search Services...

2020-02-10 13:18:59 GMT <alfresco-discord> <yreg> 😛 I almost got it right the first time

2020-02-10 13:19:29 GMT <alfresco-discord> <yreg> I was just wondering what could the F stand for

2020-02-10 13:41:26 GMT <hi-ko> strange: a 'du' on solr/content/_DEFAULT_/db returns: 78G but the solr index is ~400 GB after reindexing 50% transactions. So this may be caused by other stuff like using secondaries for large directories ...

2020-02-10 13:46:29 GMT <AFaust> Reindexing from scratch or reindexing with existing index using an admin URL action?

2020-02-10 13:47:12 GMT <hi-ko> from scratch because I had doubts in index consistencies

2020-02-10 14:23:40 GMT <hi-ko> AFaust, angelborroy: do you know if solr is creating a full lucene doc for every secondary assoc?

2020-02-10 14:24:36 GMT <angelborroy> Sample of secondary assoc?

2020-02-10 14:24:38 GMT <hi-ko> or even worse: n lucene docs for every child of a secondary assoc of a cm:folder

2020-02-10 14:24:50 GMT <angelborroy> What is a secondary assoc?

2020-02-10 14:25:03 GMT <hi-ko> secondary child assoc

2020-02-10 14:25:12 GMT <AFaust> Old SOLR used to do that for handling PATH, but newer SOLR 4 + ASS should not AFAIK

2020-02-10 14:25:42 GMT <angelborroy> No, SOLR 6 is not storing children as SOLR Docs

2020-02-10 14:26:34 GMT <hi-ko> hmm - thanks. So still no idea for that large index size.

2020-02-10 14:37:27 GMT <alfresco-discord> <yreg> hi-ko is there a chance you would be having a lot of breaking inheritance

2020-02-10 14:37:34 GMT <alfresco-discord> <yreg> and node-level permissions

2020-02-10 14:37:35 GMT <alfresco-discord> <yreg> ?

2020-02-10 14:37:48 GMT <alfresco-discord> <yreg> I know that can be a huge pain

2020-02-10 14:38:11 GMT <alfresco-discord> <yreg> I have a client with twice as many ACL/ACE docs as actual node docs in solr

2020-02-10 14:38:17 GMT <hi-ko> yreg: I have to check that.

2020-02-10 14:38:41 GMT <hi-ko> Why do you think many acls could blow up the index?

2020-02-10 14:39:30 GMT <alfresco-discord> <yreg> I know it for a fact from a client I have, but maybe angelborroy can shed some light on the why ..

2020-02-10 14:40:57 GMT <hi-ko> This system has lots of recursive groups which may result in lots of ace rows. Let me check ...

2020-02-10 14:43:23 GMT <hi-ko> SELECT COUNT(*) FROM alf_access_control_list WHERE inherits IS false: 11372

2020-02-10 15:07:45 GMT <alfresco-discord> <yreg> I don't think that's a lot

2020-02-10 15:07:51 GMT <alfresco-discord> <yreg> seems quite reasonable

2020-02-10 15:09:01 GMT <hi-ko> many of them are user homes

2020-02-10 15:10:17 GMT <alfresco-discord> <yreg> It can't be what I suspected, the client I told you about had a behaviour to break inheritance and set permissions based on attributes programmatically

2020-02-10 15:10:52 GMT <alfresco-discord> <yreg> they have 2M+ nodes affected

2020-02-10 15:11:20 GMT <hi-ko> huii

2020-02-10 15:11:48 GMT <hi-ko> so they had/have a lot of acls!

2020-02-10 15:12:53 GMT <alfresco-discord> <yreg> occupying twice as many in solr index as node docs, probably also in size since only a selection of content types had FTS enabled

2020-02-10 16:06:51 GMT <fwu2018> hello all

2020-02-10 16:07:41 GMT <phaleth> hi

2020-02-10 16:09:14 GMT <fwu2018> anyone using the rest api (alfresco 5.2) to update tasks? Im getting 14-20 seconds to update a task. Shouldnt these values be much better than this?

2020-02-10 16:09:54 GMT <fwu2018> im using this: alfresco/api/-default-/public/workflow/versions/1/tasks/115743?select=state,variables&alf_ticket=TICKET_82a211d3e1ee4b8ed570f0aad2d0c12f7096dc3a

2020-02-10 16:11:05 GMT <fwu2018> Im updating more or less 15 variable values in this request

2020-02-10 17:56:31 GMT <fwu2018> brb

End of Daily Log

The other logs are at http://esplins.org/hash_alfresco