I'm working on a problem where I need to extract data from text. I first considered using regular expressions, but some of the data is not in a format I'm sure how to handle, and I'm not even sure regex is the best way to handle it. Some lines are a simple [fieldname]: value followed by a newline. Unfortunately, others have nested data, such as contacts. Here is an example:
Contacts: last update 11/30/2015 10:25 AM (PST)
Dispatch and Operations: Mike (Dispatcher) (Primary Contact) Phone: 111-111-1111 Fax: 111-111-1111 Email: email@example.com Owner or Officer: Jane Doe (President) Phone: 222-222-2222 Fax: 111-111-1111 Email: firstname.lastname@example.org
SERVICES: last update 11/12/2016 03:41 PM (PST)
You will see it has a starting piece of text I can find, but I only want the two contacts below it, excluding the last update time. Additionally, the first line of each contact is its title, which I can't count on for pattern matching since it is a freeform field. I could go line by line, but that would mean hard-coding this data into my code. I tried Googling it, but I haven't been able to find anything that addresses this problem. I'm hoping someone can give me some direction to help me get back on track. Thanks.
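One possible direction, as a minimal sketch rather than a definitive answer: treat the problem in two passes. First cut out the block between the "Contacts:" header and the next "NAME: last update" header, then anchor on the fixed labels (Phone, Fax, Email) and treat whatever precedes them as the freeform title and name. The names `SECTION_RE`, `CONTACT_RE`, and `extract_contacts` are hypothetical, and the sketch assumes every contact carries Phone, Fax, and Email in that order, which the real data may not guarantee.

```python
import re

# Sample text modeled on the example above (layout is an assumption).
SAMPLE = (
    "Contacts: last update 11/30/2015 10:25 AM (PST)\n"
    "Dispatch and Operations: Mike (Dispatcher) (Primary Contact) "
    "Phone: 111-111-1111 Fax: 111-111-1111 Email: email@example.com "
    "Owner or Officer: Jane Doe (President) "
    "Phone: 222-222-2222 Fax: 111-111-1111 Email: firstname.lastname@example.org\n"
    "SERVICES: last update 11/12/2016 03:41 PM (PST)\n"
)

# Pass 1: skip the "Contacts:" header line, then capture everything up to
# the next line that looks like "SOMETHING: last update" (or end of text).
SECTION_RE = re.compile(
    r"^Contacts:[^\n]*\n(?P<body>.*?)(?=^[^\n:]+: last update|\Z)",
    re.MULTILINE | re.DOTALL,
)

# Pass 2: the title is whatever precedes the first colon; the name is
# whatever sits between that colon and the fixed "Phone:" label.
# Assumption: Phone, Fax, and Email always appear, in this order.
CONTACT_RE = re.compile(
    r"(?P<title>[^:\n]+):\s*"
    r"(?P<name>.+?)\s+"
    r"Phone:\s*(?P<phone>\S+)\s+"
    r"Fax:\s*(?P<fax>\S+)\s+"
    r"Email:\s*(?P<email>\S+)"
)


def extract_contacts(text):
    """Return a list of dicts, one per contact found in the Contacts section."""
    section = SECTION_RE.search(text)
    if section is None:
        return []
    return [
        {key: value.strip() for key, value in match.groupdict().items()}
        for match in CONTACT_RE.finditer(section.group("body"))
    ]


if __name__ == "__main__":
    for contact in extract_contacts(SAMPLE):
        print(contact)
```

Because the whitespace between fields is matched with `\s+`, the same patterns should also cope with a layout where each field sits on its own line. If some contacts omit a field, the fixed-order pattern would need to be loosened, for example by splitting the block on the known labels instead.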