Skip to content

Support for pickling of finite automata #125

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

EduardoGoulart1
Copy link
Contributor

@EduardoGoulart1 EduardoGoulart1 commented Jan 16, 2023

The initial use-case that I stumbled upon was to serialize DFAs with pickle. I would get an error because it was calling setattr, which is disabled even though I was setting the allow_mutable_automata flag.

If we allow for mutable automata, then one should be able to modify its attributes.

global_config.allow_mutable_automata = True
dfa_1 = DFA(
    states={0, 1},
    input_symbols={0, 1},
    transitions={0: {0: 1, 1: 0}, 1: {0: 1, 1: 1}},
    initial_state=0,
    final_states={1},
)

with open('dfa.pkl', 'wb') as f:
    pickle.dump(dfa_1, f)
with open('dfa.pkl', 'rb') as f:
    dfa_1_read = pickle.load(f)
assert dfa_1 == dfa_1_read

Edit: After some consideration, I decided to disable set/get attributes as this will mess with the validate function. Instead serialization is implemented directly with the setattribute/getattribute functions

@coveralls
Copy link

coveralls commented Jan 16, 2023

Coverage Status

Coverage: 100.0%. Remained the same when pulling 9a8326c on EduardoGoulart1:set_attribute_for_mutable_automata into 1dc2a59 on caleb531:develop.

@EduardoGoulart1
Copy link
Contributor Author

@caleb531 On a side note, I feel like the behavior for allow_partial is underspecified. For example, the following (no transition function for one of the states) works, but it's unclear what it means:

  dfa_1 = DFA(
      states={0, 1},
      input_symbols={0, 1},
      transitions={0: {0: 1, 1: 0}},
      initial_state=0,
      final_states={1},
      allow_partial=True,
  )

The code throws an error one tries to do dfa_1 == dfa_1

Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EduardoGoulart1 Wow, great catch—these are definitely cases that need to be considered.

However, it looks like the code coverage has decreased—can you please add tests for these new conditionals? Probably under tests/test_config.py would be the best place to add them.

@EduardoGoulart1
Copy link
Contributor Author

@EduardoGoulart1 Wow, great catch—these are definitely cases that need to be considered.

However, it looks like the code coverage has decreased—can you please add tests for these new conditionals? Probably under tests/test_config.py would be the best place to add them.

@caleb531 will do it later today. I will also to try to come up with a solution for the allow_partial issue

@EduardoGoulart1
Copy link
Contributor Author

@EduardoGoulart1 Wow, great catch—these are definitely cases that need to be considered.

However, it looks like the code coverage has decreased—can you please add tests for these new conditionals? Probably under tests/test_config.py would be the best place to add them.

For the allow_partial issue, there are two problems:

1- Iterating over all inputs symbols without checking whether they are part of the state's transition function:

    dfa_2 = DFA(
        states={0, 1},
        input_symbols={0, 1},
        transitions={0: {0: 1}},
        initial_state=0,
        final_states={1},
        allow_partial=True,
    )
    intersect = dfa_2 & dfa_2

The code above fails because somewhere we do for symbol in input_symbols: state_transition[symbol]. When state=0 and symbol=1 you get a key error.

2- States that have no transition function

    dfa_2 = DFA(
        states={0, 1},
        input_symbols={0, 1},
        transitions={0: {0: 1, 1:1}},
        initial_state=0,
        final_states={1},
        allow_partial=True,
    )
    intersect = dfa_2 & dfa_2

The code above fails because somewhere we do dfa.transitions[state], which gives a key error when state=1.

The first is a bug and we must rework the methods to correctly handle partial functions. The second is more on our internal representation and I would rather fix it by requiring that all states have a transition function (which could be empty).

My suggestion is to create two separate issues. Then I can pick them up later when I have time

@caleb531
Copy link
Owner

@EduardoGoulart1 So, after having given it some thought, I am actually seriously considering removing allow_partial altogether. It was well-intentioned, but creates a lot of complexity in the DFA class, and it becomes difficult to guarantee that the existing DFA operations will work correctly with partial DFAs (because, as you're finding, you need to test against multiple permutations of partial DFAs for every operation).

@eliotwrobson When you have a moment, what are your thoughts on this? Here is the original issue that prompted the introduction of allow_partial.

@eliotwrobson
Copy link
Collaborator

I think I'm in agreement, the allow partial option is really just a minor convenience that can be removed. If we wanted to instead, we could have the constructor add an extra dump state that represents the same operation as allow partial for missing transitions, but even that may be overkill. Using two different assumptions for the internal representation of the DFA causes more problems than it fixes.

caleb531 added a commit that referenced this pull request Jan 21, 2023
@caleb531
Copy link
Owner

@eliotwrobson All good points, and agreed. I think a sufficient replacement might be something like a DFA.from_partial method or something like that (which would accept all the same parameters as the DFA constructor, but with the added behavior to fill in the gaps with the throwaway state.

However, that gets tricky when the user can make up any state names they want, so we would need some algorithm to determine the name of the throwaway state that doesn't conflict with any of the states they've defined. But maybe we could have DFA.from_partial accept a required keyword parameter that make the user supply the name of that throwaway state. Then there's no ambiguity, and it would enable the user to make the resulting DFA consistent with the existing states from their partial DFA.

Unless we do something like retain_names=False where we just generate a DFA with our own state names. I hope you see what I'm getting at here—what are your thoughts?

@EduardoGoulart1 please feel free to chime in as well, if you so desire.

@eliotwrobson
Copy link
Collaborator

I think having an optional parameter for a throwaway state is fine. By default, we can just use the smallest nonnegative integer that doesn't conflict with the existing state names (I think something similar is used in some of the DFA algorithms)

@eliotwrobson
Copy link
Collaborator

Related to this, @EduardoGoulart1 do you think there's a way to enable pickling support for immutable automatons? I think it would be nice to always have, especially if people are making large ones.

@EduardoGoulart1
Copy link
Contributor Author

Related to this, @EduardoGoulart1 do you think there's a way to enable pickling support for immutable automatons? I think it would be nice to always have, especially if people are making large ones.

This PR should address exactly that... I changed it so that you don't need to set allow_mutable for pickling

@EduardoGoulart1
Copy link
Contributor Author

@eliotwrobson All good points, and agreed. I think a sufficient replacement might be something like a DFA.from_partial method or something like that (which would accept all the same parameters as the DFA constructor, but with the added behavior to fill in the gaps with the throwaway state.

However, that gets tricky when the user can make up any state names they want, so we would need some algorithm to determine the name of the throwaway state that doesn't conflict with any of the states they've defined. But maybe we could have DFA.from_partial accept a required keyword parameter that make the user supply the name of that throwaway state. Then there's no ambiguity, and it would enable the user to make the resulting DFA consistent with the existing states from their partial DFA.

Unless we do something like retain_names=False where we just generate a DFA with our own state names. I hope you see what I'm getting at here—what are your thoughts?

@EduardoGoulart1 please feel free to chime in as well, if you so desire.

@caleb531 and @eliotwrobson I don't think that the library should be guided by my specific needs, but allow_partial is actually useful for my use case. For instance, think about a "linear" DFA with a large alphabet. Not only is it more memory efficient, but it can also be more computationally efficient e.g. for minimization.

On the other hand, I'm not too deep into the codebase to know how much pain allow_partial causes us. My impression is that if we define its semantics properly, it won't be too bad.

@EduardoGoulart1
Copy link
Contributor Author

My suggestion is to focus on this PR (pickling of FAs) and make an issue where we can discuss support for allow_partial more deeply. I would like to take a look at the code before making a suggestion. Unfortunately, it's hard to find free time with conference deadlines, etc 😖

@EduardoGoulart1 EduardoGoulart1 changed the title Allows set/del attribute when automata are mutable Support for pickling of finite automata using __getattribute__/__setattribute__ Jan 22, 2023
Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EduardoGoulart1 Looks pretty good! Just need a few docstrings for consistency, and we'll be set.

@EduardoGoulart1
Copy link
Contributor Author

@EduardoGoulart1 Looks pretty good! Just need a few docstrings for consistency, and we'll be set.

Hopefully it's good to go now

Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EduardoGoulart1 Yep, looks good to me. Thank you!

@caleb531 caleb531 changed the title Support for pickling of finite automata using __getattribute__/__setattribute__ Support for pickling of finite automata Jan 25, 2023
@caleb531 caleb531 merged commit f79c6c9 into caleb531:develop Jan 25, 2023
@caleb531 caleb531 mentioned this pull request Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants