
Anyone else messing around with neural nets?


mister phlegm


For no apparent reason I got the urge to fire up Python and try to hack one together over the Easter holidays. I had no idea the maths was quite as simple as it (mostly) is. It's all accessible with newly-met numpy, and my mind is blown again. Of course that's in theory; in practice I am a genuinely horrible programmer with a lot of ML gaps to fill too.

 

I didn't want to retread datasets or have to give it training feedback manually, so I'm going to try to train it to play a variant of Connect 4 I made up and am hacking out some code to implement. The board has 42 cells, same as C4, so the input layer is an array of 42 numbers, each 0, 1 or 2: a grid with 6 columns and 7 rows in my head, but one big column in code. The output layer is an array of twelve numbers, representing the possible moves. I want two layers in between and I'm not sure how big they should be for the problem at hand yet. I'll try 20 each when the game code is done, for over a thousand adjustables (numbers we multiply by or add to the previous column) in total, if I get how it works.
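A quick way to sanity-check the "over a thousand adjustables" figure is simply to count them. This is a minimal sketch, assuming the layer sizes mentioned above (42 inputs, two hidden layers of 20, 12 outputs) and one bias per non-input node; the helper name `count_parameters` is made up for illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected net."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weights between the two layers
        total += n_out         # one bias per node in the receiving layer
    return total

sizes = [42, 20, 20, 12]  # assumed: input, two hidden layers, output
n_params = count_parameters(sizes)
print(n_params)
```

Under those assumptions the count comes to 1,532 (860 + 420 + 252), so "over a thousand" holds comfortably.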

 

So the input gets multiplied by and added to these adjustables in a couple of distinct stages, each with a function to provide some non-linearity (otherwise the whole NN would collapse into a single linear map, i.e. plain linear regression, regardless of the number and size of the middle layers), and throws out a column of what the agent thinks of the various move options. Subtract that column from some version of a reality check (won, missed win, lost avoidably, didn't lose etc.) to see how each prediction fared. Work out how much the weights and biases influenced each result and adjust. Work out how much the previous weights and biases affected things and adjust those too. Then take a new input and do it all again. And again.
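The point about non-linearity can be checked numerically. The sketch below uses random stand-in weights (not a trained net) and `tanh` as an arbitrary choice of activation: two stacked layers with no activation give exactly the same answer as one merged linear layer, and adding the activation breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=42)  # stand-in for a flattened board
W1, b1 = rng.normal(size=(42, 20)), rng.normal(size=20)
W2, b2 = rng.normal(size=(20, 12)), rng.normal(size=12)

# Two stacked layers with no activation in between...
two_layers = (x @ W1 + b1) @ W2 + b2

# ...are exactly one linear layer with merged weights and biases.
W_merged = W1 @ W2
b_merged = b1 @ W2 + b2
one_layer = x @ W_merged + b_merged
collapses = np.allclose(two_layers, one_layer)

# With a non-linearity between the layers the merge no longer reproduces the output.
with_tanh = np.tanh(x @ W1 + b1) @ W2 + b2
differs = not np.allclose(with_tanh, one_layer)
print(collapses, differs)
```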

 

Can fifteen hundred or so numbers (1,532 if I count weights and biases for two middle layers of 20) really learn to play not-connect4 well enough to beat me? If not, how many numbers will it take?

  • Like 1
Link to comment
Share on other sites

Is this a question from a Code Academy course?

 

I Taught a Machine How to Play Connect 4 (Towards Data Science . COM)

 

 

Link to comment
Share on other sites

I really didn't want to re-solve an old problem or copy-type someone else's project, I find I don't learn much that way over just making stuff up and trying it out. Similarly I don't want to start by using Tensorflow or whatever, though it would probably speed me up.

As far as I am aware my C4 variant is original, and looks thusly:

 

Quote

Move please player 2
 

[[0 0 0 0 0 2]
 [0 0 0 0 0 2]
 [0 0 2 1 1 0]
 [0 1 1 2 0 0]
 [2 2 2 1 0 0]
 [1 1 0 0 0 0]
 [1 0 0 0 0 0]]

 

You can insert counters either above or below, and they 'fall' up or down to hit the underlines for a maximum of twelve possible moves over the seven of the original. Win conditions are unchanged.

 

The game code is done. Next up is formulating a reward function for the NN to learn from. So maybe something like win is +10, loss/invalid move -10, preventing an opponent win +5, missing winning move -5 or whatever. I'd like some small increment for prolonging the game, but may have to then modify the modifiers to make a quicker win yield more to balance. I originally intended to run the net on the game and train it as I went, but apparently it's better to record games and moves and hit the NN with a random ordering of them to minimise unwanted correlations, so I'll figure out how to do that too. I'm still learning what my options are.
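The two ideas above (a table of rewards, and recording moves so they can be replayed in random order) might be sketched like this. The reward values are the ones floated in the post and are placeholders to tune; the `ReplayBuffer` class, its method names and the capacity are all hypothetical.

```python
import random
from collections import deque

# Reward values taken from the post; treat them as starting guesses.
REWARDS = {
    "win": 10.0,
    "loss": -10.0,
    "invalid_move": -10.0,
    "blocked_opponent_win": 5.0,
    "missed_win": -5.0,
}

class ReplayBuffer:
    """Stores (state, move, reward) records and hands back shuffled batches."""

    def __init__(self, capacity=10_000):
        self.records = deque(maxlen=capacity)  # oldest records drop off first

    def add(self, state, move, reward):
        self.records.append((state, move, reward))

    def sample(self, batch_size):
        # Random sampling without replacement breaks up the move-by-move ordering.
        k = min(batch_size, len(self.records))
        return random.sample(list(self.records), k)

buffer = ReplayBuffer()
for move in range(12):
    state = [0] * 42  # dummy flattened board, just to exercise the buffer
    buffer.add(state, move, REWARDS["missed_win"])
batch = buffer.sample(4)
```

A small per-move bonus for prolonging the game could be added as another entry in `REWARDS`, at the cost of rebalancing the other values as noted above.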

 

My poor coding skills are slowing me down (I'll have to rewrite a bunch of my code just to implement the evaluations, that's how bad it is) but I have a reasonable grasp of the maths and I'd much prefer that to the other way round.

Link to comment
Share on other sites

Now I finally know what those ducks were all about, so I've learned something, thanks :). Maybe you could bob above the wreckage, help with the black box recovery. I've binned my first attempt and now have a more elegant game solution once it works properly, but have had a busier week than expected.

Link to comment
Share on other sites

Game code is rewritten and functions OK. NN code is written but untested as of right now.

 

The input is a list of 43 numbers, the first giving the number of counters left to play and the rest representing a rectangle of six columns by seven rows. Each cell contains 0 if no counter is there, 1 for player 1's counter and 2 for player 2's. It is assumed it is always player 1's go, so in a game the counters are flipped after every turn.
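One way the 43-number input could be built is sketched below. It assumes "counters left to play" means the number of empty cells, and the function name `encode` is invented for the example; the flip swaps 1s and 2s so the net always sees the board as player 1.

```python
import numpy as np

ROWS, COLS = 7, 6  # board shape described in the post

def encode(board, player):
    """Flatten a board into the 43-number input, seen from `player`'s side."""
    board = np.asarray(board)
    view = board.copy()
    if player == 2:
        # Swap 1s and 2s so the net always sees itself as player 1.
        view[board == 1] = 2
        view[board == 2] = 1
    counters_left = ROWS * COLS - np.count_nonzero(board)  # assumed meaning
    return np.concatenate(([counters_left], view.ravel())).astype(float)

board = np.zeros((ROWS, COLS), dtype=int)
board[6, 0] = 1
board[5, 0] = 2
x = encode(board, player=2)
```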

The output is a list of twelve numbers, each representing how good or bad the net thinks each move is. It has no concept of invalid moves and will have to learn this as it goes.

 

Taking I as the input and O as the output, how do we get from one to the other? More numbers of course. I've put in a hidden layer of twenty nodes (numbers) to enable enough complexity to learn. So we go I to H to O essentially. It's the getting from I to H and the getting from H to O where the learning takes place and this requires one block of numbers and one list for each mapping from one layer to the next. One for I to H and one for H to O. I believe the latter is good for ducks :)

 

Our input is a list of 43 numbers. Our weights for the first layer, W1, will be a block of numbers, 43 rows and 20 columns. Line up a column against the input. We multiply the first of each, then the second of each, and so on, and add them all together to get a number. Each column will spit out one number, so we end up with a list of twenty numbers. Each number will have a bias added from our list of first biases, B1. Then apply an activation function and we have the twenty nodes of H.

 

To get from H to O we start with those twenty numbers and want to end up with twelve. W2 will be twenty rows and twelve columns in consequence, B2 a list of twelve numbers. Apply another activation function and we have what the NN thinks of the possible moves.
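The I to H to O mapping described above fits in a few lines of numpy. This is only a sketch: the shapes come from the post, but the random initial weights and the choice of `tanh` for both activation functions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from the post: 43 inputs, 20 hidden nodes, 12 outputs.
W1 = rng.normal(scale=0.1, size=(43, 20))
B1 = np.zeros(20)
W2 = rng.normal(scale=0.1, size=(20, 12))
B2 = np.zeros(12)

def forward(I):
    """Map an input list I to hidden nodes H and move scores O."""
    H = np.tanh(I @ W1 + B1)  # each column of W1 yields one hidden node
    O = np.tanh(H @ W2 + B2)  # each column of W2 yields one move score
    return H, O

I = np.zeros(43)
I[0] = 42  # empty board with all counters left to play
H, O = forward(I)
best_move = int(np.argmax(O))  # index of the move the net rates highest
```

`I @ W1` does the "multiply the first of each, then the second of each, and add" step for all twenty columns at once.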

 

W1, B1, W2 and B2 are the heart of the learning process, the adjustable parameters: 860 + 20 + 240 + 12 = 1132 of them. Training them needs feedback, so I've cobbled together a crude evaluation function based solely on the next move.

 

To train, we take the difference of each prediction from the evaluation, look at the contribution of each weight and bias to it, and change them a bit to make future predictions closer to the evaluation. This means the NN won't be playing the game initially, just playing 'match my evaluation', so I'm not hoping for much more than avoiding invalid moves and making/thwarting wins-in-one.
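A single training step of that kind might look like the following. It is a sketch rather than the actual code from this project: it assumes `tanh` activations, a squared-error loss, plain gradient descent with a made-up learning rate, and a made-up input and target standing in for a board and its evaluation.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, B1 = rng.normal(scale=0.1, size=(43, 20)), np.zeros(20)
W2, B2 = rng.normal(scale=0.1, size=(20, 12)), np.zeros(12)

def train_step(I, target, lr=0.05):
    """One gradient-descent step on the loss 0.5 * sum((O - target)**2)."""
    global W1, B1, W2, B2
    H = np.tanh(I @ W1 + B1)
    O = np.tanh(H @ W2 + B2)
    error = O - target
    # Backpropagate, using tanh'(z) = 1 - tanh(z)**2.
    dO = error * (1 - O**2)        # gradient at the output pre-activations
    dH = (dO @ W2.T) * (1 - H**2)  # gradient at the hidden pre-activations
    W2 -= lr * np.outer(H, dO)
    B2 -= lr * dO
    W1 -= lr * np.outer(I, dH)
    B1 -= lr * dH
    return 0.5 * float(np.sum(error**2))

I = rng.uniform(0, 1, size=43)  # made-up input, scaled to [0, 1]
target = np.full(12, -0.5)      # pretend the evaluation dislikes most moves...
target[3] = 0.8                 # ...but likes move 3
losses = [train_step(I, target) for _ in range(200)]
```

Repeating the step on the same example should drive the loss down, which is a handy first test before feeding it real game records.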

Link to comment
Share on other sites

  • 4 weeks later...

I asked ChatGPT to write a Connect 4 game in Python for you..

 

import numpy as np

# Define the game board
ROWS = 6
COLUMNS = 7
board = np.zeros((ROWS, COLUMNS), dtype=int)

# Define a function to print the board
def print_board(board):
    print("\n")
    print(board)
    print("\n")
    print("  1 2 3 4 5 6 7")

    for i in range(ROWS):
        print("|", end=" ")
        for j in range(COLUMNS):
            if board[i][j] == 0:
                print(".", end=" ")
            elif board[i][j] == 1:
                print("X", end=" ")
            else:
                print("O", end=" ")
        print("|")

    print("|---------------|")
    print("| 1 2 3 4 5 6 7 |")
    print("\n")

# Define a function to check if the game is over
def game_over(board):
    # Check for horizontal wins
    for i in range(ROWS):
        for j in range(COLUMNS - 3):
            if board[i][j] == board[i][j+1] == board[i][j+2] == board[i][j+3] != 0:
                return True

    # Check for vertical wins
    for i in range(ROWS - 3):
        for j in range(COLUMNS):
            if board[i][j] == board[i+1][j] == board[i+2][j] == board[i+3][j] != 0:
                return True

    # Check for diagonal wins (top-left to bottom-right)
    for i in range(ROWS - 3):
        for j in range(COLUMNS - 3):
            if board[i][j] == board[i+1][j+1] == board[i+2][j+2] == board[i+3][j+3] != 0:
                return True

    # Check for diagonal wins (top-right to bottom-left)
    for i in range(ROWS - 3):
        for j in range(3, COLUMNS):
            if board[i][j] == board[i+1][j-1] == board[i+2][j-2] == board[i+3][j-3] != 0:
                return True

    # Check if the board is full
    if np.count_nonzero(board) == ROWS * COLUMNS:
        return True

    # If none of the above conditions are true, the game is not over
    return False

# Define a function to make a move
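def make_move(player, column):
    # Scan from the bottom row upwards and fill the first empty cell
    for i in range(ROWS - 1, -1, -1):
        if board[i][column] == 0:
            board[i][column] = player
            return True
    # No empty cell found: the column is full
    return False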

# Define a function to play the game
def play_game():
    print("Welcome to Connect Four!")
    print_board(board)

    player = 1
    while not game_over(board):
        column = input("Player " + str(player) + ", enter a column number (1-7): ")
        column = int(column) - 1

        if column < 0 or column >= COLUMNS:
            print("Invalid column number. Try again.")
            continue

        if not make_move(player, column):
            print("Column is full. Try again.")
            continue

        print_board(board)

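        # Hand the turn to the other player
        if player == 1:
            player = 2
        else:
            player = 1

    print("Game over!")

if __name__ == "__main__":
    play_game()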
 

Yep. :yep: 

Link to comment
Share on other sites
