Debian i18n - Policies and documentation
========================================

!! XXX !! - Eventually this needs to go to a wiki, ikiwiki, HTML, SVN/git,
            somewhere so people can read it and know how it operates

Index
=====

 * Internal policies and documentation
 * Decision from Debian i18n Sprint 2012 (Paris, France)
 * Database for DDTP
 * SVN -> git conversion

Internal policies and documentation
-----------------------------------

Useful information for people maintaining services on the machines

 * Server maintenance
   - Dependencies on other packages are expressed using meta-packages
     + debian.org-$yourservice.d.o
     + git+ssh://git.debian.org/git/mirror/debian.org.git
     + Created debian-l10n/debian.org.git, which should be used to sync
       with DSA
       . Created debian.org-ddtp.debian.org
       . Created debian.org-l10n.debian.org
       . Created debian.org-i18n.debian.org
   - It is allowed to use packages/modules from backports.debian.org
   - It is allowed to use non-packaged software
     + Avoid running stuff on personal directories ($HOME, public_html)
     + Avoid PHP, and be reasonable about what we are running
     + Do not mix both packaged & non-packaged scripts of the same software
   - Cronjobs
     + sudo -u debian-i18n crontab -e
     + sudo -u ddtp crontab -e
   - Both machines have incoming/outgoing email
       /srv/ddtp.debian.org/mail
       /srv/i18n.debian.org/mail
   - Configuration can be added/amended through puppet (git requests
     preferred)
     + git+ssh://git.debian.org/git/mirror/dsa-puppet.git
   - Python
     + Was used for Pootle, but we are dropping Pootle
     + May be required for
       . Weblate (django)
       . Transifex (django)
       . New DDTP/DDTSS (django+sqlalchemy)
   - Perl CGI
     + Memory is limited by DSA, it might need to be adjusted
   - Access to Debian mirror
     + NFS mounted mirrors at /srv/mirrors/debian
   - tye.debian.org
     + debian-i18n group has SSH access and apachectl
     + i18n.debian.org
       Services common to all teams or used to generate specific items.
       The rule of thumb is that the data available from i18n.d.o is used
       to build l10n.d.o
     + l10n.debian.org
       Services for individual teams (stats, coordination pages, robots)
   - dukas.debian.org
     + ddtp group has SSH access and apachectl
     + ddtp.debian.org
       DDTP - Debian Description Translation Project, includes the DDT
       Server and DDTSS (DDTS Satellite or Debian Distributed Translation
       Server Satellite), a web frontend to the DDTS.

 * .net domains (ddtp.d.n and i18n.d.n)
   - DSA (zobel) made:
     + ddtp.debian.NET owned by ddtp group
     + i18n.debian.NET owned by debian-i18n group

 * Guest accounts
   - DSA procedure: http://dsa.debian.org/doc/guest-account/
     + Used to give Martijn van Oosterhout (kleptog, non-DD) access to
       i18n and ddtp machines

 * Reboots
   - DSA reboots debian.org machines ASAP after stable kernel updates
     + For i18n services this seems fine and is the current practice on
       churro
     + Scripts must be idempotent and try to recover by themselves
     + If needed, lockfiles can be used to delay reboot
     + NEEDS FIX: Translated package descriptions take a long time (most
       of the day), so that will need to be improved, one way or another.

Decision from Debian i18n Sprint 2012 (Paris, France)
-----------------------------------------------------

 * decide where to put the statistics:
   - website
   - i18n
 * prepare specification for
   - new gen-material (e.g. using lintian labs)
   - mail based coordination robot
   - use of pg instead of file database
   => plan a Google Summer of Code on this topic
 * plan future ddtp updates
 * plan next i18n sessions
 * review / merge todo lists

Database for DDTP
-----------------

 * Database is around 2.2GB on churro.
   - Isn't a big part of it disposable?
     + The most important part is only a few hundred MB.
     + Currently the largest table is statistic_tb (1057MB), which is used
       for statistics. The code that uses it is only present in the new
       DDTP/DDTSS codebase; grisu added the support. It is used to make
       graphs of milestone progress for language teams, so they can see
       their progress over time.

       ddtp=> select relname, pg_size_pretty( pg_total_relation_size(oid) )
              from pg_class where relnamespace=2200 and relkind='r'
              order by pg_total_relation_size(oid) desc;
                relname          | pg_size_pretty
       --------------------------+----------------
        statistic_tb             | 1057 MB
        description_milestone_tb | 233 MB
        ddtss                    | 227 MB
        part_tb                  | 144 MB
        package_version_tb       | 134 MB

   - Do we need to keep archived releases in the database?
     + Yes, maybe for translation memory, anything outside database is
       much more work to use
       . Not necessarily, if we don't use it
     + No, Translation files are static and translation memory can be
       stored in a different way

 * PostgreSQL is on a dedicated machine connected using pg_service
     PostgreSQL 9.1
     Server: danzi.debian.org
     Cluster: debian-i18n
     Database: ddtp
     'psql "service=ddtp"'

 * How to generate a PDF of the schema
   - virtualenv tmp
   - cd tmp
   - bin/easy_install sadisplay psycopg2
   - bin/sadisplay postgresql:///ddtp | dot -Tpdf > schema.pdf

SVN -> git conversion
---------------------

Authors meta data: mapping the svn account to a From line to put in the
git log

  barbier = Denis Barbier
  bubulle = Christian PERRIER
  elric-guest = Omar Campagne
  faw = Felipe Augusto van de Wiel
  fzt = Nicolas François
  grisu = Michael Bramer
  jfs = Javier Fernandez-Sanguino Peña
  kleptog-guest = Martijn van Oosterhout
  mquinson-guest = Martin Quinson
  nekral-guest = Nicolas François
  pmachard = Pierre Machard
  taffit = David Prévot
  themill-guest = Stuart Prescott
  thuriaux-guest = Thomas Huriaux
  pootle-guest = Pootle guest
  root = The knights of nicky-nicky
  (no author) = The knights of nicky-nicky

Current organisation of scripts in churro
-----------------------------------------

/srv/dl10n-stuff/svn/dl10n/
    - SVN working copy
/srv/dl10n-stuff/svn/dl10n/data
    - status (l10n + nmu) data generated here
/srv/dl10n-stuff/bin/
    - cron jobs + utils + config
/srv/dl10n-stuff/log/
    - cron jobs logs (.log / .err for gen-testing / gen-unstable /
      spiderbts / dl10n-nmu)
/srv/i18n.debian.net/cron/
    - cronjobs (uses /srv/dl10n-stuff/bin/ scripts)
      (cron.d / cron.hourly / cron.monthly)
/srv/i18n.debian.net/mail/
    - not used currently. archive of l10n mailing lists for robot.
      rotation broken.
/srv/i18n.debian.net/tmp/gen-compendia
    - TMP dir for gen-compendia (clean)
/srv/i18n.debian.net/tmp/gen-material
    - TMP for gen-material (clean)
/srv/i18n.debian.net/www/debian-l10n-material
    - contains the database with the translation status of the archive
      (data/unstable.gz / data/testing.gz); and the extracted material
      (menu / po / templates or testing/menu / testing/po /
      testing/templates) and the associated tar balls.
      This is used internally by other tools (e.g. dl10n-txt) or by
      www.d.o
/srv/i18n.debian.net/www/debian-l10n-coordination
    - web pages for l10n teams
/srv/i18n.debian.net/www/debian-l10n-coordination/l10n-nmu
    - web page for NMU campaign
/srv/i18n.debian.net/www/l10n-pkg-status
    - data for the PTS
/srv/ddtp.debian.net/
    - script for handling DDTP
/srv/ddtp.debian.net/ddtss/
    - scripts for handling DDTSS
/var/www/ddtp/ddt.cgi
    - main DDTP script
/var/www/ddtp/ddtss/index.cgi
    - main DDTSS script

Issues in current organization
------------------------------

cron jobs not in VCS (as templates)
utils not in VCS (as templates)
config templates not in VCS
no log rotation
DDTP/DDTSS web scripts running are not those in the working directory
Data generated in VCS working copy
No atomic change of data (data continuously updated in
/srv/i18n.debian.net)
    --> in packages.d.o, rsync-mirror using hardlinks from the place
        where data are produced to the publication place
Corruption prone scripts
No backup
/srv/i18n.debian.net/www/debian-l10n-coordination/l10n-nmu to be located
elsewhere
Hardcoded list of languages (dl10n scripts and www.d.o)
Abort scripts in case of mirror unavailability (through presence of
/srv/mirrors/debian/project/trace/ftp-master.debian.org; workaround for
possible automount issues)

Proposed organization for scripts
---------------------------------

/srv/i18n.d.o/dl10n/git/ - VCS working copy
    same as currently
    move compendia to ../../ [DONE]
    cron from bin directory [DONE]
    etc from bin directory [DONE]
    html renamed htdocs-static [DONE]
    pootle to be removed [DONE]

/srv/i18n.d.o/tmp/ - TMP directories
    $program/$program.YYYYMMDD-HHMM

/srv/i18n.d.o/log/ - log directories
    $program/$program.YYYYMMDD-HHMM

/srv/i18n.d.o/dl10n/data/
    spiderbts/
        data/status.
        html/
    gen-material/
        data/
            unstable.gz - translation status database for unstable
            testing.gz  - translation status database for testing
        po/
            unstable
            testing
        templates/...
        menu/...
    nmu-update/
        data
        html

/srv/l10n.d.o/htdocs/
    coordination/ (link to dl10n/data/spiderbts/html/)
        00data (link to dl10n/data/spiderbts/data)
        01static (link to dl10n/git/htdocs-static)
    stats
    compendia

/srv/i18n.d.o/htdocs/
    material (link to dl10n/data/gen-material)
    nmu-radar (link to dl10n/data/nmu-update/html)
    spiderbts (link to dl10n/data/spiderbts)
    l10n-pkg-status -> l10n-pkg-status.YYYYMMDD-HHMM

/srv/ddtp.d.o/ddtp/
    git     -> git repository
    log     -> logs
    tmp     -> temporary files (translation files prior to checks)
    inputs  -> popcon, packages files not on mirror
    outputs -> translation files for upload
    home    -> ddtp user home directory, should be empty
    etc     -> contains apache config
    mail    -> mail configuration?
    htdocs  -> static web data

Questions:
    cronjobs?
    compendia?

www.d.o and pts.d.o to be informed
churro & i18n.d.o to run in parallel
    i18n.debian.net/debian-l10n-material/testing
    i18n.debian.org/material/testing
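
Sketch: atomic publication with a timestamped directory

The "No atomic change of data" issue and the proposed
"l10n-pkg-status -> l10n-pkg-status.YYYYMMDD-HHMM" link could be handled
roughly as below. This is only a minimal sketch under assumptions, not the
deployed scripts: the function names publish and mirror_available are
hypothetical, it swaps a symlink rather than using the rsync/hardlink
method mentioned for packages.d.o, and it relies on "mv -T" from GNU
coreutils.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative.
set -eu

# mirror_available TRACE_FILE
# Succeeds only if the mirror trace file exists, so a cron job can stop
# early when the NFS mirror is not mounted.
mirror_available() {
    [ -e "$1" ]
}

# publish SRC_DIR BASE NAME
# Copies SRC_DIR to BASE/NAME.YYYYMMDD-HHMM, then repoints the symlink
# BASE/NAME at the new copy with a single rename, so readers see either
# the old data set or the new one, never a half-written mix.
publish() {
    src=$1
    base=$2
    name=$3
    stamp=$(date -u +%Y%m%d-%H%M)
    dest="$base/$name.$stamp"

    mkdir -p "$dest"
    cp -a "$src/." "$dest/"

    # Create the new link under a temporary name, then rename it over
    # the published name (mv -T never descends into the old target).
    ln -s "$name.$stamp" "$base/$name.new.$$"
    mv -T "$base/$name.new.$$" "$base/$name"
}

# Hypothetical use from a cron job:
#   mirror_available /srv/mirrors/debian/project/trace/ftp-master.debian.org \
#       || exit 0
#   publish /srv/i18n.d.o/tmp/l10n-pkg-status /srv/i18n.d.o/htdocs \
#       l10n-pkg-status
```

Old timestamped directories are left in place by this sketch, so a
separate cleanup step (not shown) would still be needed.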