It’s actually not as tough as it sounds. Just read in the source code and split it into chunks using the quotation mark as the delimiter. The odd-numbered chunks will be the text (counting the chunks from 0, not 1).
I’d probably do it like this.
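import re

with open("story.ni") as f:
    for line in f:
        # Odd-numbered chunks (counting from 0) are the quoted text.
        spell_array = line.split('"')[1::2]
        text_line = ' '.join(spell_array)
        # Strip bracketed substitutions such as [paragraph break].
        text_line = re.sub(r"\[[^\[\]]+\]", "", text_line)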
I’ve stopped using readlines since I came across this blog. Also, I think “[^{Something}]+” is more efficient than “.*?” due to backtracking. I tend to use the former as it feels more rigid.
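As a quick illustration of the two patterns, here is a small sketch. The sample sentence is made up, and it only shows that both patterns remove the same substitutions from ordinary text; it does not measure speed.

```python
import re

# Hypothetical sample line of Inform 7 text with two substitutions.
sample = "You see [the noun] on [the second noun]."

# Negated character class: '[', then a run of non-bracket characters, then ']'.
negated = re.sub(r"\[[^\[\]]+\]", "", sample)

# Lazy wildcard: '[', then as few characters as possible, then ']'.
lazy = re.sub(r"\[.*?\]", "", sample)

print(negated)
print(lazy)
```

The two can differ on edge cases: the negated class needs at least one character between the brackets, while the lazy version also matches an empty "[]".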
Also, I don’t think it will work correctly if you have text with actual paragraph breaks in it rather than the [paragraph break] substitution.
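To make that concrete, here is a rough sketch comparing a per-line split with a single split over the whole file. The sample source and both function names are made up for illustration.

```python
import re

# Hypothetical source: the second quoted string contains a real paragraph break.
source = (
    'The Lab is a room. "A bare room."\n'
    '\n'
    'Say "First paragraph.\n'
    '\n'
    'Second [bold type]paragraph."\n'
)

def extract_text(source_code):
    """Split the whole source once, then strip bracketed substitutions."""
    chunks = source_code.split('"')[1::2]
    return [re.sub(r"\[[^\[\]]+\]", "", chunk) for chunk in chunks]

def extract_text_per_line(source_code):
    """Split each line on its own, as in the line-by-line loop above."""
    result = []
    for line in source_code.splitlines():
        result.extend(line.split('"')[1::2])
    return result

print(extract_text(source))
print(extract_text_per_line(source))
```

With this sample, the per-line version loses "Second paragraph." because the odd/even count restarts on every line, while the whole-file split keeps the string intact.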
Sure! Why not! This is what I use.
#! /usr/bin/python
from re import sub
from sys import argv
from re import search
def join(array,delimiter=""):
    return delimiter.join(array)
class Utilities:
    def findunmatchedquotes(self,project):
        # Print each quoted string with an odd number of apostrophes used as quote marks.
        with open("/home/[username]/Inform/Projects/"+project+".inform/Source/story.ni","r") as file:
            textlist=file.read().split("\"")[1::2]
        for item in textlist:
            value=0
            count=0
            while item.find("'",value)!=-1:
                index=item.find("'",value)
                if index==0 or index==len(item)-1:
                    count+=1
                elif search(r"\w",item[index-1]) is not None and search(r"\w",item[index+1]) is not None:
                    # An apostrophe inside a word (e.g. don't) is not a quote mark.
                    count+=0
                elif item[index-1]=="[" and item[index+1]=="]":
                    # The ['] substitution is a literal apostrophe.
                    count+=0
                else:
                    count+=1
                value=index+1
            if count%2==1:
                print(item)
    def removewhitespace(self,project):
        # Write a copy of the source with trailing whitespace stripped from every line.
        with open("/home/[username]/Inform/Projects/"+project+".inform/Source/story.ni","r") as infile:
            with open("/home/[username]/Documents/Inform/Clean.txt","w") as outfile:
                for line in infile:
                    if line[-1]=="\n":
                        outfile.write(line[:-1].rstrip())
                        outfile.write("\n")
                    else:
                        outfile.write(line.rstrip())
    def split(self,project):
        # Even-numbered chunks are code, odd-numbered chunks are quoted text.
        with open("/home/[username]/Inform/Projects/"+project+".inform/Source/story.ni","r") as file:
            result=file.read().split("\"")
        with open("/home/[username]/Documents/Inform/Code.txt","w") as file:
            file.write(join(result[0::2],"\"\""))
        with open("/home/[username]/Documents/Inform/Text.txt","w") as file:
            # One string per line, so real line breaks are swapped for substitutions.
            file.write(join([item.replace("\n\n","[paragraph break]").replace("\n","[line break]") for item in result[1::2]],"\n"))
    def merge(self,project):
        with open("/home/[username]/Documents/Inform/Code.txt","r") as file:
            code=file.read().split("\"\"")
        with open("/home/[username]/Documents/Inform/Text.txt","r") as file:
            # Drops the final character, which assumes Text.txt ends with a newline.
            text=file.read()[:-1].split("\n")
        # Interleave the code and text chunks back into their original order.
        merged=[None]*(len(code)+len(text))
        merged[0::2]=code
        merged[1::2]=text
        with open("/home/[username]/Documents/Inform/Full.txt","w") as file:
            file.write(join(merged,"\""))
    def checkfind(self,project):
        # Print every paragraph that starts with "Check" but does not end with "instead.".
        with open("/home/[username]/Inform/Projects/"+project+".inform/Source/story.ni","r") as file:
            for item in file.read().split("\n\n"):
                if item[0:5]=="Check" and item[-8:]!="instead.":
                    print(item,end="\n\n")
def gettext():
    argc=len(argv)
    if argc<3:
        print("ERROR - PARAMETERS NEEDED!!")
    else:
        keys=Utilities()
        if hasattr(keys,argv[1]):
            function=getattr(keys,argv[1])
            function(join(argv[2:]," "))
        else:
            print("ERROR - NO FUNCTION FOR \""+argv[1].upper()+"\"!")

def main():
    gettext()

if __name__=="__main__":
    main()
Feel free to use any bits you find useful!
I only use a class here so that I can get the “text to function” functionality of “getattr”. It also includes some other useful tidbits that I find handy from time to time, and it’s easy to add another function if something else comes up in the future. It merges as well as splits, so that any spelling corrections can be easily put back in the source code.
I technically still have the C++ version of this somewhere in my archives. I made it about a decade ago, before I learned Python. I’d be happy to dig it out if there’s any interest.
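For anyone unfamiliar with the “getattr” trick, here is a stripped-down sketch of the idea. The Commands class, its two methods, and the dispatch helper are all made up for illustration and are not part of the script above.

```python
class Commands:
    """Hypothetical stand-in for the Utilities class."""

    def shout(self, text):
        return text.upper()

    def count(self, text):
        return str(len(text.split()))


def dispatch(name, argument):
    """Look up a method by its name and call it, or report that it is missing."""
    handler = getattr(Commands(), name, None)
    if not callable(handler):
        return 'ERROR - NO FUNCTION FOR "' + name.upper() + '"!'
    return handler(argument)


print(dispatch("shout", "hello"))
print(dispatch("count", "one two three"))
print(dispatch("missing", "x"))
```

Adding a new command is then just a matter of adding a method, with no lookup table to keep in sync. One caveat is that getattr will also find built-in attributes such as `__class__`, so a stricter version might reject names that start with an underscore.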