createBordereauParcellaire - XSD are downloaded from external sources at each generation #548

Open
pmauduit opened this issue Jan 12, 2021 · 14 comments

@pmauduit
Member

pmauduit commented Jan 12, 2021

Rennes-Métropole asked us to investigate why PDF generation of the "bordereau parcellaire" was taking a long time on their platform (generally more than 30 seconds). Here is a summary of what we have noticed so far:

After having instrumented the cadastrapp JVM on the test line, we got the following results during a call to the tested webservice:

Screenshot from 2020-12-22 12-14-57
Screenshot from 2020-12-22 12-09-22

Digging a bit further with tcpdump on the running container, we discovered that the XSDs needed to validate the GetCapabilities document were fetched on every call.

Here is the list of requests made to inspire.ec.europa.eu (plain HTTP with no TLS, so "easily" captured with tcpdump or similar tools):

http://inspire.ec.europa.eu/schemas/inspire_vs/1.0/inspire_vs.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/common.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/network.xsd
http://inspire.ec.europa.eu/2001/xml.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_bul.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_cze.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_dan.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_dut.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_eng.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_est.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_fin.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_fre.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_ger.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_gle.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_gre.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_hun.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_ita.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_lav.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_lit.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_mlt.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_pol.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_por.xsd
http://inspire.ec.europa.eu/schemas/common/1.0/enums/enum_rum.xsd

Note: fetching them in an automated way with cURL takes only ~3 seconds. I would expect at least similar performance from the JVM's HTTP client, and I cannot explain why it takes more than 4 times as long (~16 seconds according to our instrumentation, see the screenshots above), although JVM sampling certainly adds some overhead.

By the way, one improvement could be to cache the XSD, as it seems possible to do so with GeoTools: https://docs.geotools.org/stable/javadocs/org/geotools/xml/resolver/SchemaCache.html
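To make the caching idea concrete, here is a minimal sketch that uses only the JDK, not GeoTools' SchemaCache. The class `XsdDiskCache`, its `Fetcher` interface and the host/path directory layout are all hypothetical names chosen for illustration; the point is only that a schema URL is downloaded once and served from disk afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical on-disk cache for remote XSD files (illustration only, not GeoTools' SchemaCache). */
final class XsdDiskCache {

    /** Hypothetical abstraction over "download this URI", so the cache can be exercised offline. */
    interface Fetcher {
        InputStream open(URI uri) throws IOException;
    }

    private final Path directory;
    private final Fetcher fetcher;

    XsdDiskCache(Path directory, Fetcher fetcher) {
        this.directory = directory.toAbsolutePath().normalize();
        this.fetcher = fetcher;
    }

    /** Maps http://host/a/b.xsd to <directory>/host/a/b.xsd (assumed layout). */
    Path localPathFor(URI uri) {
        if (uri.getHost() == null || uri.getPath() == null) {
            throw new IllegalArgumentException("Not a hierarchical URL: " + uri);
        }
        Path target = directory.resolve(uri.getHost() + uri.getPath()).normalize();
        if (!target.startsWith(directory)) {
            throw new IllegalArgumentException("Path escapes the cache directory: " + uri);
        }
        return target;
    }

    /** Returns the local copy of the schema, downloading it only if it is not cached yet. */
    synchronized Path resolve(URI uri) throws IOException {
        Path target = localPathFor(uri);
        if (!Files.isRegularFile(target)) {
            Files.createDirectories(target.getParent());
            // Download to a temporary file first so a failed transfer never leaves a partial XSD.
            Path tmp = Files.createTempFile(target.getParent(), "xsd-", ".part");
            try (InputStream in = fetcher.open(uri)) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            } finally {
                Files.deleteIfExists(tmp);
            }
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        int[] fetches = {0};
        // Fake fetcher so the demo runs without network access.
        Fetcher fake = uri -> {
            fetches[0]++;
            return new ByteArrayInputStream("<xs:schema/>".getBytes(StandardCharsets.UTF_8));
        };
        XsdDiskCache cache = new XsdDiskCache(Files.createTempDirectory("xsd-cache"), fake);
        URI xsd = URI.create("http://inspire.ec.europa.eu/schemas/common/1.0/common.xsd");
        System.out.println(cache.resolve(xsd));
        System.out.println(cache.resolve(xsd)); // second call is served from disk
        System.out.println("fetches: " + fetches[0]);
    }
}
```

For a real run the fake fetcher could be replaced with `uri -> uri.toURL().openStream()`. How such a cache would be plugged into the validation done by cadastrapp is not covered here.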

Another remark: I know that we are in no position to lecture anyone on this topic in geOrchestra, but the GeoTools version used in cadastrapp is quite old (9.2, which was the same as the Mapfishapp one at the time cadastrapp was developed, as far as I remember). Testing a more up-to-date version might also improve things. We are currently on 21.3 with mapfishapp: https://github.com/georchestra/georchestra/blob/master/pom.xml#L45

@landryb
Member

landryb commented Jan 12, 2021

ugh. horrible. Whatever the fix, by all means +1000 on doing something to improve that :)

@pierrejego
Member

@pmauduit Yes, I agree about the GeoTools version. When we developed this extension we wanted the same dependencies as Mapfishapp, but we never updated them afterwards.

As for the latency, the strange thing is that on the JDev environment, printing a "Bordereau parcellaire" took less than 3 seconds. I have to reinstall the backend and run more tests to see whether I get the same schema calls.

But anyway, updating GeoTools and adding a schema cache is a good idea.

@landryb
Member

landryb commented Nov 22, 2021

supposedly closed by fabc4db ?

@pierrejego
Member

No, what I have done is not enough. When testing, I can't see any XSD in the temp folder.
I have never used SchemaCache; even if it is automatically configured, I think I need to declare something in GeoTools to use it.
If someone has an example, I would be interested. I have checked the GeoServer source code, but I did not find where they enable it.

@MaelREBOUX
Member

No improvement for us. Generating PDF is still slow.

@pierrejego
Member

Changing cadastre.wms.url to point at the cadastrapp workspace rather than the whole GeoServer fixes the slowness.

But at Rennes Métropole we have a problem.
I concluded that when the workspace is given in the WMS URL ( cadastre.wms.url=https://portail-test.sig.rennesmetropole.fr/geoserver/app/wms ), cadastrapp uses the layer's default SLD (which is transparent), and otherwise ( cadastre.wms.url=https://portail-test.sig.rennesmetropole.fr/geoserver/wms ) it takes the parameter sent by getImageBordereau???

@pierrejego
Member

After more testing, there is a "URL rejected" message when going through the app workspace.
So the request is being blocked by F5.

@pierrejego
Member

Test done after unblocking: 2 min for 41 parcels on portail test and 1 min for 46 parcels on gis.jdev.fr.

@pierrejego pierrejego modified the milestones: v 2.0, v 2.1 Feb 21, 2022
@pierrejego pierrejego self-assigned this Feb 21, 2022
@pierrejego
Member

Keep trying to cache the XSD, but above all the getCapabilities if possible.
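As a rough illustration of caching the getCapabilities response, the sketch below keeps each document in memory for a fixed time-to-live, keyed by WMS URL. Everything in it is hypothetical (the class `CapabilitiesCache`, the loader function, the injected clock, the 10-minute TTL); it does not reflect how cadastrapp or GeoTools fetch capabilities.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical in-memory cache with a time-to-live, keyed by WMS URL (illustration only). */
final class CapabilitiesCache {

    private static final class Entry {
        final String document;
        final long loadedAtMillis;

        Entry(String document, long loadedAtMillis) {
            this.document = document;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Function<String, String> loader; // assumed: fetches the document for a URL
    private final LongSupplier clock;              // injected so expiry can be tested

    CapabilitiesCache(Duration ttl, Function<String, String> loader, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.loader = loader;
        this.clock = clock;
    }

    /** Returns the cached document, reloading it only when missing or older than the TTL. */
    String get(String wmsUrl) {
        long now = clock.getAsLong();
        Entry entry = entries.compute(wmsUrl, (key, current) -> {
            if (current != null && now - current.loadedAtMillis < ttlMillis) {
                return current; // still fresh, no remote call
            }
            return new Entry(loader.apply(key), now);
        });
        return entry.document;
    }

    public static void main(String[] args) {
        int[] loads = {0};
        // Fake loader standing in for the real GetCapabilities request.
        CapabilitiesCache cache = new CapabilitiesCache(
                Duration.ofMinutes(10),
                url -> "capabilities #" + (++loads[0]) + " for " + url,
                System::currentTimeMillis);
        String url = "https://example.org/geoserver/wms";
        System.out.println(cache.get(url));
        System.out.println(cache.get(url)); // same document, loader not called again
        System.out.println("loads: " + loads[0]);
    }
}
```

A TTL is only one possible invalidation policy; whether a stale capabilities document is acceptable for the bordereau generation would have to be decided separately.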

@jusabatier
Collaborator

jusabatier commented Mar 24, 2022

@MaelREBOUX

No improvement for us. Generating PDF is still slow.

Even after the GeoTools version upgrade?

@pierrejego

No, what I have done is not enough. When testing, I can't see any XSD in the temp folder. I have never used SchemaCache; even if it is automatically configured, I think I need to declare something in GeoTools to use it. If someone has an example, I would be interested. I have checked the GeoServer source code, but I did not find where they enable it.

I don't know whether you have made progress on this, but given: https://github.com/geotools/geotools/blob/main/modules/library/xml/src/main/java/org/geotools/xml/SchemaFactory.java#L96

Shouldn't we simply set -Dschema.factory.cache=<to be defined> at the JVM level?
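For readers unfamiliar with `-D` flags: a value passed as `-Dschema.factory.cache=...` shows up in the JVM as a system property. The sketch below only illustrates that mechanism with a hypothetical helper, `SchemaCacheDir`, and an assumed fallback directory; it makes no claim about how GeoTools' SchemaFactory actually interprets the property.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical helper showing how a -D flag can select a cache directory (illustration only). */
final class SchemaCacheDir {

    // Property name taken from the comment above; its exact semantics in GeoTools are not verified here.
    static final String PROPERTY = "schema.factory.cache";

    /** Uses the configured directory if present, else an assumed "schema-cache" folder under java.io.tmpdir. */
    static Path resolve(Properties props) {
        String configured = props.getProperty(PROPERTY);
        if (configured != null && !configured.trim().isEmpty()) {
            return Paths.get(configured.trim());
        }
        return Paths.get(props.getProperty("java.io.tmpdir", "."), "schema-cache");
    }

    public static void main(String[] args) {
        // Run with e.g. `java -Dschema.factory.cache=/var/cache/xsd SchemaCacheDir.java`
        System.out.println(resolve(System.getProperties()));
    }
}
```

Printing the resolved path at startup is a cheap way to check that the flag really reached the JVM running the backend.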

@landryb
Member

landryb commented Jun 24, 2022

Shouldn't we simply set -Dschema.factory.cache=<to be defined> at the JVM level?

Tested locally with the v2.0 backend: nothing is cached in the directory. I don't feel I'm seeing such slowdowns; it takes ~10 s to generate a BP on my dev platform with an ortho background coming from IGN.

@pmauduit what was your tcpdump command to get only the external URLs? tcpdump on port 80 on the external interface?

@MaelREBOUX MaelREBOUX modified the milestones: v 2.1, v 2.2 Aug 22, 2022
@MaelREBOUX MaelREBOUX removed this from the v 2.2 milestone Aug 5, 2024
@MaelREBOUX
Member

Follow-up note: to be tested at Rennes AFTER the backend upgrade 1.9 -> 2.2.

@landryb
Member

landryb commented Aug 5, 2024

GeoTools has been upgraded:
