Segmentation exceptions
Thread poster: JNTrans
JNTrans
JNTrans
Poland
Sep 25, 2015

Hello,

I was wondering whether it is possible (using regular expressions) to set up segmentation exceptions so that specific paragraphs are not split.

Let us say that we have a paragraph that is static in a document template and has a static translation with a different number of sentences:

Aaa bbb ccc. Ddd eee fff. Ggg hhh iii.

And I want memoQ to import it as one segment. How should I go about setting up the rule/exceptions using regex?


 
msoutopico
msoutopico  Identity Verified
Ireland
Local time: 21:26
English to Galician
ICE Sep 25, 2015

Why don't you just pre-translate those sentences with ICE matches?

 
msoutopico
msoutopico  Identity Verified
Ireland
Local time: 21:26
English to Galician
exceptions Sep 25, 2015

If you really want to achieve that by tweaking the segmentation rules, you could:

1. Create a custom list #DoNotSegmentSentences# containing items

  • Aaa bbb ccc.
  • Ddd eee fff.
  • Ggg hhh iii.

2. Create the exception #DoNotSegmentSentences##!#[\s]+\p{Lu} to the segmentation rule #end##!#[\s]+\p{Lu}
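To illustrate how an exception like this suppresses a break, here is a minimal Python sketch. It is not memoQ's segmentation engine, only a rough simulation of the idea: a break rule fires after sentence-final punctuation followed by whitespace and an uppercase letter, and the break is skipped whenever the text before it ends with an item from the list. The names DO_NOT_SEGMENT, BREAK_RULE and segment are hypothetical, and since Python's built-in re module does not support \p{Lu}, [A-Z] stands in as an ASCII-only approximation.

```python
import re

# Hypothetical stand-in for the custom list #DoNotSegmentSentences#.
DO_NOT_SEGMENT = ["Aaa bbb ccc.", "Ddd eee fff.", "Ggg hhh iii."]

# Rough analogue of the break rule #end##!#[\s]+\p{Lu}:
# a zero-width break point after sentence-final punctuation,
# followed by whitespace and an uppercase letter.
# Assumption: [A-Z] approximates \p{Lu}, which Python's re lacks.
BREAK_RULE = re.compile(r"(?<=[.!?])(?=\s+[A-Z])")


def segment(paragraph, exceptions=DO_NOT_SEGMENT):
    """Split a paragraph at break points unless an exception applies."""
    segments = []
    start = 0
    for match in BREAK_RULE.finditer(paragraph):
        pos = match.start()
        # Exception: the text before the break ends with a listed item,
        # so this break point is ignored.
        if any(paragraph[:pos].endswith(item) for item in exceptions):
            continue
        segments.append(paragraph[start:pos].strip())
        start = pos
    segments.append(paragraph[start:].strip())
    return segments


print(segment("Aaa bbb ccc. Ddd eee fff. Ggg hhh iii."))
print(segment("One two. Three four."))
```

With the list in place, the static paragraph comes out as a single segment, while ordinary sentences are still split as usual.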


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 23:26
English to Russian
Tick 'Automatically join and split segments for best match' Sep 25, 2015

1. When you meet the first sentence (Aaa bbb ccc.) in a paragraph comprising a number of sentences (Aaa bbb ccc. Ddd eee fff. Ggg hhh iii.), join all those sentences with Ctrl+J. Then translate them as a single unit and confirm.

2. Go to the 'Operations' tab, Pre-Translate... and check the 'Automatically join and split segments for best match' box. This will join all such 'static' paragraphs further down the document in the same way you joined them the first time.

However, I do not understand why you may need this... memoQ treats them as 101%. Translating them as normal individual segments is 101% safe.
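The idea behind joining segments for the best match can be sketched in a few lines of Python. This is only a conceptual illustration under stated assumptions, not how memoQ actually implements the option: the TM is modelled as a plain dictionary with exact-match lookup, and the function name pretranslate and the max_join limit are hypothetical.

```python
def pretranslate(segments, tm, max_join=5):
    """Greedy sketch: at each position, try the longest run of
    consecutive segments whose joined text has an exact TM entry."""
    result = []
    i = 0
    while i < len(segments):
        # Try the longest join first, then shorter ones.
        for n in range(min(max_join, len(segments) - i), 0, -1):
            joined = " ".join(segments[i:i + n])
            if joined in tm:
                result.append((joined, tm[joined]))
                i += n
                break
        else:
            # No TM hit: leave the segment untranslated.
            result.append((segments[i], None))
            i += 1
    return result


# The TM holds the paragraph that was joined and confirmed once.
tm = {"Aaa bbb ccc. Ddd eee fff. Ggg hhh iii.": "Xxx yyy."}
segments = ["Intro text.", "Aaa bbb ccc.", "Ddd eee fff.", "Ggg hhh iii."]
print(pretranslate(segments, tm))
```

Once the joined paragraph has been confirmed into the TM, later occurrences that were imported as three separate sentences can be matched as one unit.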



[Edited at 2015-09-25 18:38 GMT]


 
msoutopico
msoutopico  Identity Verified
Ireland
Local time: 21:26
English to Galician
101% = ICE Sep 26, 2015

101% is what I meant by ICE (In-Context Exact) matches. Sorry if it wasn't clear.

I like Stepan's suggestion better than mine.

Cheers, Manuel


 
Oliver Walter
Oliver Walter  Identity Verified
United Kingdom
Local time: 21:26
German to English
Manual segmentation Sep 26, 2015

I asked a related question in August 2011, about what I called manual segmentation. There were not many replies, but they did include some information about memoQ:

http://www.proz.com/forum/cat_tools_technical_help/204631-segmentation_rules_are_not_intelligent_manual_adjustment_is_vital.html

That may be seen as an old discussion thread, but I think the topic is still relevant and I would still like to see the related information about other CAT software.


 
JNTrans
JNTrans
Poland
TOPIC STARTER
Thank you Sep 28, 2015

First of all - thank you for all the answers. I will try them out and let you know.

Stepan Konev wrote: "However, I do not understand why you may need this... memoQ treats them as 101%. Translating them as normal individual segments is 101% safe."


There are two main reasons: to automate this and to make it as foolproof as possible. Due to legal requirements, the translation does not match the source, and translators often feel the urge to ignore instructions and change the matches. Also, because the translation has a different number of sentences, translating sentence by sentence creates wonky pairs in the TM. Whole-paragraph pairs would be more useful.


 

