Alfresco discussion and collaboration. Stick around a few hours after asking a question.
Official support for Enterprise subscribers: support.alfresco.com.
Join in the conversation by getting an IRC client and connecting to #alfresco at Freenode. Or you can use the IRC web chat.
More information about the channel is in the wiki.
More help is available in this list of resources.
2017-12-18 06:09:43 GMT <ankita> hi everyone
2017-12-18 06:10:15 GMT <ankita> I am following https://addons.alfresco.com/addons/capturesco-alfresco-scanning-tool
2017-12-18 06:10:17 GMT <alfbot> Title: Capturesco - Alfresco scanning tool | Alfresco Add-ons - Alfresco Customizations (at addons.alfresco.com)
2017-12-18 06:11:20 GMT <ankita> and trying to integrate the capturesco open source tool which captures the documents during scanning and uploads them directly to alfresco
2017-12-18 06:12:46 GMT <ankita> so,what I have done is that I have run the Capturesco 2.0.exe file
2017-12-18 06:13:23 GMT <ankita> On running this file,it is asking for username and password
2017-12-18 06:14:47 GMT <ankita> so I have entered the same user name and password that I have applied when running the alfresco
2017-12-18 06:15:16 GMT <ankita> but still it is prompting that the login credentials are wrong
2017-12-18 06:15:31 GMT <ankita> I am not able to figure this out
2017-12-18 06:15:48 GMT <ankita> any help would be greatly appreciated
2017-12-18 06:33:29 GMT <divya> I am following the link Capturesco - Alfresco scanning tool | Alfresco Add-ons - Alfresco Customizations to integrate the capturesco into alfresco that captures the documents during scanning and uploads them directly to alfresco. I have started the alfresco and when I run the file Capturesco 2.0.exe directly from the bin folder in capturesco2\TWAINCapture\bin\Release,it is asking for the user name and password.So,I have entered th
2017-12-18 06:33:40 GMT <divya> I have entered while installing the alfresco,but it is still saying that "Invalid Username/password"
2017-12-18 06:33:57 GMT <divya> I am not able to figure out why this is so.. any idea regarding this would be greatly appreciated
2017-12-18 07:15:43 GMT *** ChanServ sets mode: +o fcorti
2017-12-18 07:16:45 GMT *** fcorti changes topic to "Alfresco discussion and collaboration. Stick around a few hours after asking a question. Logs: http://chat.alfresco.com Channel help: https://community.alfresco.com/ Official support for Enterprise subscribers: http://support.alfresco.com. Next event is DevCon in Lisbon the 16-17-18th of January: http://devcon.alfresco.com/"
2017-12-18 07:19:05 GMT <twen> good snowy morning
2017-12-18 07:48:15 GMT <Tichodroma> Good morning
2017-12-18 08:11:01 GMT <yreg> Good morning folks
2017-12-18 08:52:37 GMT <digcat> morning all
2017-12-18 14:19:12 GMT <Tichodroma> Do you know how to disable the Alfresco Solr Suggester?
2017-12-18 14:20:12 GMT <CptLuxx> ahh
2017-12-18 14:20:15 GMT <CptLuxx> easy i had this last week
2017-12-18 14:20:22 GMT <mrks_js1> i did it for solr1, there are some how tos (in the doc i believe)
2017-12-18 14:20:24 GMT <CptLuxx> olr4/workspace-SpacesStore/conf
2017-12-18 14:20:29 GMT <CptLuxx> solr4/workspace-SpacesStore/conf
2017-12-18 14:20:34 GMT <CptLuxx> and change the config
2017-12-18 14:20:42 GMT <CptLuxx> Changed solr.suggester.enabled from true to false
2017-12-18 14:20:59 GMT <Tichodroma> let me try that ...
2017-12-18 14:28:07 GMT <yreg> Tichodroma, I think even if you do that, data would still be stored for suggestions, if I am not mistaken, we had to edit schema.xml to squeeze even more performance
2017-12-18 14:29:02 GMT <CptLuxx> wait what
2017-12-18 14:29:10 GMT <CptLuxx> i did that last week and it worked hmpf
2017-12-18 14:30:59 GMT <Tichodroma> actually I am not sure what the problem is that I am facing
2017-12-18 14:31:35 GMT <Tichodroma> so I am trying some settings
2017-12-18 14:32:15 GMT <Tichodroma> if Solr errors with: Namespace prefix cm is not mapped to a namespace URI
2017-12-18 14:32:56 GMT <yreg> that's probably a problem with the model
2017-12-18 14:33:10 GMT <Tichodroma> I guess the model XML is not available for Solr, alf_data/solr/model is empty
2017-12-18 14:33:24 GMT <Tichodroma> What could prevent Solr from fetching the model XMLs from Alfresco?
2017-12-18 14:33:50 GMT <angelborroy> No errors at SOLR log?
2017-12-18 14:34:26 GMT <Tichodroma> not while starting, the first error is the missing NS when I perform the search
2017-12-18 14:34:56 GMT <angelborroy> SOLR should send a request to gather the model
2017-12-18 14:35:04 GMT <Tichodroma> so I thought
2017-12-18 14:35:16 GMT <angelborroy> https://docs.alfresco.com/5.2/concepts/solr-overview.html
2017-12-18 14:35:18 GMT <alfbot> Title: Solr overview | Alfresco Documentation (at docs.alfresco.com)
2017-12-18 14:35:23 GMT <angelborroy> it includes “model” in the URL
2017-12-18 14:35:36 GMT <angelborroy> you can try to remove indexes and to start again
2017-12-18 14:36:03 GMT <angelborroy> and some request from SOLR to Alfresco should be registered both in SOLR log and Apache / Tomcat access log
2017-12-18 14:36:03 GMT <Tichodroma> bad idea, the index is HUUUUUGE and building is $$$ expensive (loading from The Cloud)
2017-12-18 14:36:11 GMT <angelborroy> wow
2017-12-18 14:36:15 GMT <angelborroy> so the problem is big
2017-12-18 14:36:31 GMT <angelborroy> you can try the URL by yourself
2017-12-18 14:36:38 GMT <angelborroy> to detect if it’s producing some problem
2017-12-18 14:40:00 GMT <angelborroy> btw I had also a weird problem last week
2017-12-18 14:40:05 GMT <angelborroy> I’m still studying it
2017-12-18 14:40:22 GMT <angelborroy> 100% CPU consumption by PDFBox
2017-12-18 14:40:35 GMT <angelborroy> When trying to index content from a PDF containing images
2017-12-18 14:41:03 GMT <angelborroy> Anyone heard about this problem before?
2017-12-18 14:48:40 GMT <yreg> Tichodroma, did you have a recent change in the model ?
2017-12-18 14:48:47 GMT <yreg> (alfresco side ?)
2017-12-18 14:54:10 GMT <Tichodroma> none
2017-12-18 14:54:52 GMT <Tichodroma> the system is very large and having low level hardware/storage problems
2017-12-18 15:01:52 GMT <fwu> hi all!
2017-12-18 15:29:37 GMT <fwu> angelborroy, I know that, but I would like to make specific js files for each ftl. So I would like to make an include, import, or something like that. As I do with js files inside bpmn files.
2017-12-18 15:30:08 GMT <fwu> right now, I need to compile the ftl in a jar file everytime I make a change
2017-12-18 15:30:10 GMT <angelborroy> fwu probably I don’t understand your question
2017-12-18 15:30:23 GMT <angelborroy> you can make a JS import in FTL
2017-12-18 15:46:32 GMT <fwu> I tried but it seems it doesnt work
2017-12-18 15:47:00 GMT <fwu> I tried this: <@script src=
2017-12-18 15:47:05 GMT <fwu> maybe this is wrong
2017-12-18 15:47:56 GMT <angelborroy> it’s fine
2017-12-18 15:48:03 GMT <angelborroy> fwu what is the problem?
2017-12-18 15:48:56 GMT <fwu> I cant see the js get from fiddler, so it is not running
2017-12-18 15:49:25 GMT <angelborroy> ok, so your JS is extenal to Share, right?
2017-12-18 15:49:33 GMT <angelborroy> extenal > external
2017-12-18 15:51:17 GMT <fwu> ah! that may be the problem.
2017-12-18 15:51:22 GMT <angelborroy> yes, it is
2017-12-18 15:51:45 GMT <fwu> I want internal, but im trying external... I should try as I need
2017-12-18 15:52:04 GMT <fwu> I will try that, thank you!
2017-12-18 16:02:06 GMT <yreg> mbui, is there such a thing for eclipse ?
2017-12-18 16:02:55 GMT <angelborroy> some colleagues of me are using this https://www.jetbrains.com/webstorm/features/
2017-12-18 16:02:56 GMT <alfbot> Title: Features - WebStorm (at www.jetbrains.com)
2017-12-18 16:03:04 GMT <Tichodroma> mbui: what is special about Rhino/Nashorn code? What kind of support for JS do you need?
2017-12-18 16:03:09 GMT <angelborroy> but it’s not IDEA, it’s a different environment
2017-12-18 16:06:10 GMT <mbui> Well, basically i'm just wondering if there's some plugins/IDE's that supports the import statements or the "for each" statements.
2017-12-18 16:06:51 GMT <Tichodroma> the import is special to Alfresco
2017-12-18 18:42:00 GMT <angelborroy> AFaust sorry to bother you…
2017-12-18 18:42:17 GMT <angelborroy> … had you any CPU 100% consumption due to PDF content extraction?
2017-12-18 18:42:41 GMT <angelborroy> Is the first time I’ve seen that and the customer said that it should be something “usual”
2017-12-18 18:43:03 GMT <angelborroy> I’ve had to port part of this method https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java#L170
2017-12-18 18:43:04 GMT <alfbot> Title: tika/PDF2XHTML.java at master · apache/tika · GitHub (at github.com)
2017-12-18 18:43:07 GMT <angelborroy> to Tika 1.6
2017-12-18 18:43:14 GMT <angelborroy> In order to solve the issue
2017-12-18 18:47:44 GMT <AFaust> angelborroy: Well, you are surely not asking IF I ever had such an issue, because then the answer would be yes.
2017-12-18 18:48:05 GMT <AFaust> More likely you are asking if I had the problem on a specific version of Alfresco with a specific constellation of PDF document, right?
2017-12-18 18:48:39 GMT <angelborroy> for me it’s with PDFs produced by CamScanner
2017-12-18 18:48:47 GMT <angelborroy> in both CE 5.1 & 5.2
2017-12-18 18:49:13 GMT <AFaust> I have had such situations at customers when the PDF was a bit too large / complex and the system too tight on memory, so I would get 100% due to GC overhead
2017-12-18 18:49:49 GMT <angelborroy> thanks
2017-12-18 18:49:54 GMT <angelborroy> this is not my scenario
2017-12-18 18:50:20 GMT <angelborroy> It’s a semi-infinite loop when Tika is extracting images
2017-12-18 18:50:58 GMT <angelborroy> In fact, the loop is infinite but it’s not detected by the JVM and in the end gets 100% of CPU
2017-12-18 18:51:20 GMT <angelborroy> it’s weird, it’s supposed that this should happen before
2017-12-18 18:51:32 GMT <angelborroy> But I cannot find any reference
2017-12-18 18:51:39 GMT <angelborroy> Thanks again AFaust
2017-12-18 18:52:56 GMT <AFaust> So it occurs on TIKA 1.6 and you patched that. What was the original lines where it did occur?
2017-12-18 18:53:23 GMT <angelborroy> let me find it
2017-12-18 18:53:23 GMT <angelborroy> Well, you are surely not asking IF I ever had such an issue, because then the asnwer would be yes.
2017-12-18 18:53:26 GMT <angelborroy> sorry
2017-12-18 18:53:32 GMT <angelborroy> https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java#L170
2017-12-18 18:53:33 GMT <alfbot> Title: tika/PDF2XHTML.java at master · apache/tika · GitHub (at github.com)
2017-12-18 18:53:38 GMT <angelborroy> This is the new implementation of the method
2017-12-18 18:53:55 GMT <AFaust> I mean, you ported something back to TIKA 1.6, which you certainly did not do blindly...
2017-12-18 18:54:06 GMT <angelborroy> yep
2017-12-18 18:54:12 GMT <AFaust> So the entire method you say...
2017-12-18 18:54:25 GMT <angelborroy> nope
2017-12-18 18:54:28 GMT <angelborroy> This is the original
2017-12-18 18:54:29 GMT <angelborroy> https://github.com/apache/tika/blob/1.6/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java#L295
2017-12-18 18:54:30 GMT <alfbot> Title: tika/PDF2XHTML.java at 1.6 · apache/tika · GitHub (at github.com)
2017-12-18 18:54:51 GMT <angelborroy> So I only took the “COSBase” detection
2017-12-18 18:55:04 GMT <AFaust> Ah - I can see how the method was wayyy shorter in 1.6
2017-12-18 18:55:08 GMT <angelborroy> Just to stop the recursion at https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java#L199
2017-12-18 18:55:09 GMT <alfbot> Title: tika/PDF2XHTML.java at master · apache/tika · GitHub (at github.com)
2017-12-18 18:56:40 GMT <AFaust> Ah - I see.
2017-12-18 18:56:42 GMT <angelborroy> I need to produce a PDF without personal data to open a issue in Alfresco (as Tika has solved this)
2017-12-18 18:57:05 GMT <AFaust> Hmm - if this would cause an infinite recursion I would expect some sort of StackOverflow at some point
2017-12-18 18:57:18 GMT <angelborroy> Yep
2017-12-18 18:57:31 GMT <angelborroy> It’s infinite but it’s not exactly detected by JVM
2017-12-18 18:57:56 GMT <AFaust> Unless - due to the recursion and for-loop, I can see how you could run into a memory issue before you run into the StackOverflow one.
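Editor's sketch (not part of the log): the "COSBase" detection mentioned above amounts to remembering which objects have already been processed, so the recursion cannot revisit them. The Java below is a minimal illustration of that idea on a made-up object graph. It does not use Tika or PDFBox classes, and `ResourceNode`, `countUnguarded`, `countGuarded` and `cyclicSample` are hypothetical names chosen for this example. The depth cap in the unguarded walk exists only so the demo terminates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class RecursionGuardSketch {

    // Hypothetical stand-in for a nested PDF resource; not a real PDFBox type.
    static final class ResourceNode {
        final String name;
        final List<ResourceNode> children = new ArrayList<>();

        ResourceNode(String name) {
            this.name = name;
        }
    }

    // Unguarded walk: follows every child, including self references.
    // maxDepth is an artificial cap so that this demo returns at all.
    static long countUnguarded(ResourceNode node, int depth, int maxDepth) {
        if (depth > maxDepth) {
            return 0;
        }
        long visits = 1;
        for (ResourceNode child : node.children) {
            visits += countUnguarded(child, depth + 1, maxDepth);
        }
        return visits;
    }

    // Guarded walk: each object is processed once, tracked by identity.
    static long countGuarded(ResourceNode node, Set<ResourceNode> seen) {
        if (!seen.add(node)) {
            return 0; // already processed, stop the recursion here
        }
        long visits = 1;
        for (ResourceNode child : node.children) {
            visits += countGuarded(child, seen);
        }
        return visits;
    }

    // Convenience overload that creates the identity-based "seen" set.
    static long countGuarded(ResourceNode root) {
        Set<ResourceNode> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        return countGuarded(root, seen);
    }

    // A form that contains two images and (twice) a reference back to itself.
    static ResourceNode cyclicSample() {
        ResourceNode form = new ResourceNode("form");
        form.children.add(new ResourceNode("imageA"));
        form.children.add(new ResourceNode("imageB"));
        form.children.add(form);
        form.children.add(form);
        return form;
    }

    public static void main(String[] args) {
        ResourceNode root = cyclicSample();
        System.out.println("guarded visits: " + countGuarded(root));
        for (int depth = 10; depth <= 12; depth++) {
            System.out.println("unguarded visits, depth cap " + depth + ": "
                    + countUnguarded(root, 0, depth));
        }
    }
}
```

In this toy graph the unguarded walk roughly doubles its work for every extra level, which is one way a runaway recursion with a for-loop inside it can keep a CPU busy for a long time while the stack depth grows only slowly. Whether the real document behaves the same way is not established by this sketch.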
2017-12-18 18:58:26 GMT <angelborroy> let me reproduce the problem (one sec)
2017-12-18 18:59:13 GMT <AFaust> It should be quite easy to reproduce this issue if the customer / you are able to identify the PDF being processed while CPU spikes to 100%
2017-12-18 18:59:33 GMT <angelborroy> yep, the customer is having the issue all the time
2017-12-18 18:59:45 GMT <angelborroy> and I have a document to produce it at will
2017-12-18 19:00:21 GMT <AFaust> Is it a document you could share (i.e. non-sensitive)?
2017-12-18 19:00:30 GMT <angelborroy> nope, this is the problem
2017-12-18 19:01:10 GMT <AFaust> Ok, I see that CamScanner has a BASIC (free) account. Let's see if I can reproduce this by scanning something with that app
2017-12-18 19:01:20 GMT <angelborroy> it should be nice
2017-12-18 19:03:17 GMT <AFaust> Since I already have my local 5.2 up to do some more testing and feedback on the trivial ALF-21963 I can do a quick test of this as well
2017-12-18 19:04:41 GMT <angelborroy> Are you using Android?
2017-12-18 19:04:48 GMT <angelborroy> I’m doing the same operation with iOS
2017-12-18 19:06:14 GMT <AFaust> Yes, Android
2017-12-18 19:06:51 GMT <angelborroy> in my case it was “multiple"
2017-12-18 19:08:53 GMT <angelborroy> I cannot get the same effect with iOS
2017-12-18 19:08:57 GMT <angelborroy> It works as expected
2017-12-18 19:10:39 GMT <AFaust> Grml - I hate it when you can't easily download files from such apps via USB, but have to go via Google Drive... such BS
2017-12-18 19:10:54 GMT <angelborroy> In Apple we have AirDrop
2017-12-18 19:11:05 GMT <angelborroy> to send the file directly from the phone to the computer
2017-12-18 19:11:59 GMT <angelborroy> I’m going to try using the OCR feature
2017-12-18 19:13:16 GMT <AFaust> Do you always get the bogus title + description value "þÿ"?
2017-12-18 19:14:04 GMT <AFaust> But no 100% CPU with a simple test scan here...
2017-12-18 19:14:16 GMT <angelborroy> The same here
2017-12-18 19:14:25 GMT <angelborroy> title and description are the name of the file
2017-12-18 19:14:39 GMT <angelborroy> yep, it works
2017-12-18 19:15:08 GMT <angelborroy> -= THIS MESSAGE NOT LOGGED =-
2017-12-18 19:20:13 GMT <angelborroy> AFaust I’ve anonymised the file
2017-12-18 19:20:26 GMT <angelborroy> AFaust are you interested in testing the “virus”?
2017-12-18 19:22:49 GMT <AFaust> Can do - just got it
2017-12-18 19:24:14 GMT <AFaust> Ah yes, the CPU fan is picking up the pace...
2017-12-18 19:24:46 GMT <angelborroy> It’s an infinite loop
2017-12-18 19:24:54 GMT <angelborroy> It will not end
2017-12-18 19:25:14 GMT <angelborroy> But I don’t know what this document has
2017-12-18 19:25:21 GMT <angelborroy> Why is special?
2017-12-18 19:25:49 GMT <angelborroy> Applying the patch, it goes flui
2017-12-18 19:25:51 GMT <angelborroy> fluid
2017-12-18 19:25:52 GMT <AFaust> 11 levels of recursion currently...
2017-12-18 19:26:17 GMT <angelborroy> that is: an infinite recursion
2017-12-18 19:26:54 GMT <AFaust> Now 12 levels - so it will (at some point very long in the future) run into the StackOverflow
2017-12-18 19:27:06 GMT <angelborroy> probably
2017-12-18 19:27:20 GMT <AFaust> I'm getting a memory dump and will look at that
2017-12-18 19:27:22 GMT <angelborroy> but it waits a lot of time
2017-12-18 19:27:45 GMT <angelborroy> you’ll see that “extractImages” from PDF2XHTML method
2017-12-18 19:28:51 GMT <AFaust> Yes, that is what I meant with X-levels of recursion
2017-12-18 19:29:55 GMT <angelborroy> I’ll raise an issue at Alfresco (I have to expand that white boxes), but probably it will be discarded
2017-12-18 19:30:51 GMT <AFaust> "probably" => very likely
2017-12-18 19:30:57 GMT <angelborroy> yep
2017-12-18 19:31:04 GMT <angelborroy> I must use “likely”, I know
2017-12-18 19:32:33 GMT <angelborroy> Anyway you can confirm that you’ve never seen this behaviour, right?
2017-12-18 19:33:41 GMT <AFaust> So far, I had not seen this particular behaviour, correct.
2017-12-18 19:33:57 GMT <angelborroy> thanks, this is what I guessed
2017-12-18 19:34:53 GMT <AFaust> And the bug is really with the TIKA code - the PDXObjectForm.getResources() create a copy of the PDResources which contains the very same data for the next recursion step
2017-12-18 19:35:56 GMT <angelborroy> yep
2017-12-18 19:36:10 GMT <angelborroy> this is why I applied that “COS” patch
2017-12-18 19:36:26 GMT <angelborroy> but it looks like this is not a very common case
2017-12-18 19:36:27 GMT <AFaust> And the memory overhead of the recursion is so low that you will likely not get a memory error.
2017-12-18 19:36:38 GMT <AFaust> The only thing I don't understand is why it takes so long to recurse
2017-12-18 19:36:52 GMT <angelborroy> the same
2017-12-18 19:40:47 GMT <AFaust> One operation in PDResources appears to be extremely inefficient
2017-12-18 19:40:54 GMT <AFaust> PDResources.reverseMap
2017-12-18 19:42:38 GMT <AFaust> This is being called every time the duplicated PDResources.getXObjects() is being called, so for every recursion step
2017-12-18 19:43:41 GMT <AFaust> This has nearly 98% of the CPU time
2017-12-18 19:43:58 GMT <angelborroy> my point is, why this only happens with that kind of document?
2017-12-18 19:44:16 GMT <AFaust> And most of it is due to a HashMap.put - that does not make much sense
2017-12-18 19:44:24 GMT <AFaust> Well, it depends on the PDF document structure
2017-12-18 19:44:51 GMT <AFaust> You need to have this PDXObjectForm in the first place
2017-12-18 19:44:56 GMT <angelborroy> Alfresco is patching Tika
2017-12-18 19:45:13 GMT <angelborroy> so probably my modification could be added to that patch
2017-12-18 19:45:16 GMT <AFaust> I've looked in the patched TIKA source the whole time...
2017-12-18 19:45:24 GMT <AFaust> Yes, it could / should
2017-12-18 19:45:54 GMT <AFaust> And they have also patched PDFBox which is the one where the inefficient reverseMap operation is in
2017-12-18 19:46:05 GMT <angelborroy> yep
2017-12-18 19:46:05 GMT <AFaust> Haven't found the Alfresco source for that patch though
2017-12-18 19:46:12 GMT <angelborroy> the same
2017-12-18 19:46:23 GMT <angelborroy> probably it’s in an inner SVN branch
2017-12-18 19:47:22 GMT <AFaust> Well, there is a sources ZIP in artifacts server - problem is that Maven only deals with a sources.jar, so again, boohoo Alfresco for bad source attachment management
2017-12-18 19:47:54 GMT <AFaust> Whoooot? And the sources.zip is 78 MiB in size? What the heck...?
2017-12-18 19:52:20 GMT <AFaust> Oh - if you don't stop it, it will escalate into multiple threads running the same file. Probably due to timeout on the SOLR side...
2017-12-18 19:53:27 GMT <angelborroy> Yep
2017-12-18 19:53:40 GMT <angelborroy> This is what was happening in the customer environment
2017-12-18 19:54:14 GMT <angelborroy> And in the end, the service was discontinued due to CPU consumption
2017-12-18 19:54:16 GMT <AFaust> Did you by any chance check any of the JIRA issues for 6.0 if TIKA is going to be updated?
2017-12-18 19:54:38 GMT <angelborroy> nope
2017-12-18 19:55:15 GMT * AFaust bashes his head against the table in frustration about Alfresco Nexus / artifact management
2017-12-18 19:55:26 GMT <AFaust> angelborroy: Guess what is contained in the sources.zip
2017-12-18 19:56:44 GMT <AFaust> No guess? Well, then I'm just going to tell....
2017-12-18 19:57:56 GMT <AFaust> Someone ZIPed the entire source code project on disk, and did so, after running "mvn package" (or something more elaborate), so all the (sub-)projects contain a "target" directory with all the Maven output and temporary data, including extracted files, WARs, JARs...
2017-12-18 19:59:14 GMT <angelborroy> heh
2017-12-18 19:59:33 GMT <angelborroy> 78 MiB
2017-12-18 19:59:48 GMT <AFaust> Unpacked it is actually ~190 MiB
2017-12-18 19:59:54 GMT <angelborroy> wow
2017-12-18 20:02:57 GMT <AFaust> Ok, the reverseMap() operation actually looks fine, so it must have something to do with some of the hashCode/equals implementations in the PDFBox value classes. But this is where I'll stop checking this.... Still have to write up an offer for some work that may keep me busy about a quarter of next year...
2017-12-18 20:03:55 GMT <angelborroy> thanks AFaust
2017-12-18 20:04:02 GMT <angelborroy> you’re my hero ;-)
2017-12-18 20:11:04 GMT <angelborroy> AFaust https://issues.alfresco.com/jira/browse/ALF-21970
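Editor's sketch (not part of the log): AFaust's guess above was that the time spent in HashMap.put might come from the hashCode/equals implementations of the key classes. That guess was not confirmed in the conversation. The Java below only illustrates the general mechanism: when many keys share one hash code, every put has to fall back on equals() comparisons inside a single bucket. The `Key` class and the `equalsCallsForInserts` helper are hypothetical and unrelated to the PDFBox value classes.

```java
import java.util.HashMap;
import java.util.Map;

final class HashCollisionSketch {

    // Counts equals() invocations; single-threaded demo only.
    static long equalsCalls = 0;

    // Hypothetical key type whose hash quality can be switched.
    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            // A constant hash sends every key to the same bucket.
            return constantHash ? 42 : id;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Key && ((Key) other).id == id;
        }
    }

    // Inserts n distinct keys and reports how often equals() was needed.
    static long equalsCallsForInserts(int n, boolean constantHash) {
        equalsCalls = 0;
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, constantHash), i);
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("equals() calls, distinct hashes: "
                + equalsCallsForInserts(1000, false));
        System.out.println("equals() calls, constant hash:   "
                + equalsCallsForInserts(1000, true));
    }
}
```

With distinct hash codes the map can tell the keys apart by hash alone, so the comparison count stays at zero for this input. With a constant hash code every insert has to compare against keys that are already stored.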
2017-12-18 22:00:05 GMT <brian-int> hi, when configuring Alfresco v201711(6.0.0) on CentOS7, via the wizard, should one change the default 127.0.0.1 (loopback) port to the public IP address of the machine?
2017-12-18 22:00:31 GMT <brian-int> s/wizard/text wizard/
2017-12-18 22:01:50 GMT <brian-int> text mode installer instructions on the alfresco site doesn't say much about this option but the GUI version says a bit but isn't clear: http://docs.alfresco.com/community/tasks/simpleinstall-community-lin.html
2017-12-18 22:01:52 GMT <alfbot> Title: Installing Alfresco Community Edition on Linux | Alfresco Documentation (at docs.alfresco.com)
2017-12-18 22:02:28 GMT <brian-int> also, does it take DNS names, ex: hostname.domain.com? or do I have to enter in an IP?
2017-12-18 23:43:48 GMT <AFaust> brian-int: I never use any of the installers, so I don't know how these inputs are mapped. You need to configure an externally addressable DNS name for that value or any emails generated by the server will include incorrect links / URLs. The same goes for ports, specifically the HTTP/SSL ports
The other logs are at http://esplins.org/hash_alfresco